Preface



Analysis and its Adhesions 



Between 1946 and 1990 I had thousands of students; in the very economical 
French system with its auditoria for two hundred people or more, this was 
not difficult. On several occasions I felt the desire to write a book which, 
presupposing only a minimal level of knowledge and a taste for mathematics, 
would lead the reader to a point from which he (or she) could launch himself 
without difficulty into the more abstract or more complicated theories of the 
XXth century. After various attempts I began to write it for Springer-Verlag
in the Spring of 1996.

A long-established house, with unrivalled experience in scientific publish- 
ing in general and mathematics in particular, Springer seemed to be by far 
the best possible publisher. My dealings with their mathematical department 
over six years have quite confirmed this. As, furthermore, Catriona Byrne, 
who has responsibility for author relations in this sector, has been a friend 
of mine for a long time, I had no misgivings at confiding my francophone 
production to a foreign publisher who, though not from our parish, knows its 
profession superlatively. 

My text has been prepared in French on computer, in DOS, with the 
aid of Nota Bene, a perfectly organized, simple and rational American word
processor; but it is hardly better adapted to mathematics than the traditional
typewriters of yesteryear: Greek letters, G, /, U have to be written by hand
on the printout, something I had been doing anyway since my first 1946
typewriter. I eventually devised a coding system, for instance [[alpha]] for
Greek letters, that made it easier to translate the NB files into TeX by using
global commands. But apart from simple formulae in the main text, most of 
the others had to be typeset again for the French version. 

The excellent English translation has been much easier to do since
Dr Spain, who types in TeX, had the TeX version of the French edition.
I have taken this opportunity to make some small changes to the French
version.

* * * 

This is not a standard textbook geared to those many students who have to 
learn mathematics for other purposes, although it may help them; it is the 
reader interested in mathematics for its own sake of whom I have thought 
while writing. To many of the French students and particularly to many of 
the brightest, mathematics is merely a lift to the upper strata of society 1.
My goal is not to help bright young people to arrive among the first few 
in the entry competition for the French Ecole polytechnique so as to find 
themselves thirty years later at the service or at the head of a public or private 
enterprise producing possibly war planes, missiles, military electronics, or 
nuclear weapons 2, or who will devise all kinds of financial stunts to make
their company grow beyond what they can control, and who, in both cases, 
will make at least twenty times as much money as the winner of a Fields 
Medal does. 

The sole aim of this book thus is mathematical analysis as it was and as it 
has become. The fundamental ideas which anyone must know - convergence, 
continuity, elementary functions, integrals, asymptotics, Fourier series and in- 
tegrals - are the subject of the first two volumes. Volume II also deals with 
that part (Weierstrass) of the classical theory of analytic functions which can 
be explained with the use of Fourier series, while the other part (Cauchy) will 
be found at the beginning of Volume III. I have not hesitated to introduce, 
sometimes very early, subjects considered as relatively advanced when they 
can be explained without technical complications: series indexed by arbitrary 
countable sets, the definition and elementary properties of Radon measures 
in R or C, integrals of semi-continuous functions and even, in an Appendix 
to Chap. V, a short account of the basic theorems of Lebesgue’s theory for 
those who may care to read it at this early stage, analytic functions, the con- 
struction of Weierstrass elliptic functions as a beautiful and useful example 
of a sophisticated series, etc. 

I have tried to give the reader an idea of the axiomatic construction of set 
theory while hoping that he will take Chap. I for what it is: a contribution to 
his mathematical culture aiming at showing that the whole of mathematics 
can, in principle, be built from a small number of axioms and definitions. But 
a full understanding of this Chapter is not an obligatory prerequisite to an 
apprenticeship in analysis. The only thing the reader will have to retain is
the naive version of set theory - standard operations on sets and functions 
to which, anyway, he will get used by merely reading the next chapters - 
as well as the fact that, even at the simplest level, mathematics rests upon 
proofs of statements, an old art which, in French high schools and probably 
elsewhere as well, is in the process of becoming obsolete because, we are told, 
learning to use formulae is much more useful to most people, or because it is 
too difficult for the many children of the lower strata of society (I was one in 
the 1930s) who now flood the high schools . . . 

1 In XIXth century Cambridge, the winners of the Math Tripos would far more
often become judges or bishops than scientists. 

2 One of the brightest students I have known in thirty-five years is today the head 
of a holding company that controls, among other things, a chain of supermar- 
kets. He sells Camembert, shrink-wrapped meat, Tampax, orange juice, noodles, 
mustard, etc. If you have to choose, this is a more civilised way to squander your 
grey matter.

The sequel, in Volumes III and IV, explains subjects which require either
a much higher level of abstraction (short introductions to differential vari- 
eties and Riemann surfaces, general integration, Hilbert spaces, general har- 
monic analysis), or, in the last Chap. XII, a much higher level in computation 
techniques: Dirichlet series of number theory, elliptic and modular functions, 
connection with Lie groups. While the choice of material in Volumes I to 
IV represents a coherent and nearly selfcontained block of mathematics, it 
constitutes nothing more than one particular view of analysis. Other authors 
could have chosen other views and, for instance, tried to lead their readers 
into the theory of partial differential equations. I have not even treated dif- 
ferential equations in one variable: one can learn all about them in a myriad 
of books, and the classical results of the theory, direct applications of the 
general principles of analysis, should pose no serious problem to the student 
who has assimilated these reasonably well. 

In the first two volumes - Volumes III and IV are written in a much
more orthodox fashion - I have firmly emphasised, sometimes with the
aid of out-of-fashion excursuses in ordinary language, the ideas at the basis of
analysis, and, in some cases, their historical evolution. I am not, far from it,
an expert in the history of mathematics; some mathematicians, sensing their 
end coming, devote themselves to it late in life; others, younger, consider the 
subject sufficiently interesting to devote a substantial part of their activity 
to it; they perform a most useful task even from the pedagogical point of 
view 3 since, at twenty, which I once was, one thinks only of forging ahead 
without looking behind, and almost always without knowing where one is 
going: where and when will one learn? I have myself preferred for a quarter 
of a century to take an interest in a kind of history - science, technology, and 
armaments in the XXth century - for which mathematics does not prepare
one, though there are some indirect connections. Nevertheless I have made 
some effort to convey to the reader that the ideas and the techniques have 
evolved, and that it took between one and two centuries for the intuitions of 
the Founding Fathers to be transformed into perfectly clear concepts founded 
on unassailable arguments, awaiting the great generalisations of the XXth
century. 

Adopting this point of view has led me, in these first two volumes,
systematically to eschew a perfectly linear exposition, organised like clockwork
and only presenting to the reader the dominant or à la mode point of view,
with assorted Blitzbeweise, lightning proofs in the sense in which we speak of
Blitzkrieg 4: one ratifies the result but does not comprehend the strategy until
six months after the battle. At the cost of proving the same classical results
several times over I have tried to present several methods of arguing to the
reader, and to make clear the necessity of rigour by evidencing the doubtful
arguments, and sometimes false results, due to mathematicians like Newton,
the Bernoullis, Euler, Fourier, or Cauchy. Adopting this point of view
lengthens the text palpably, but one of the ground principles of N. Bourbaki - no
economies of paper - is, I think, mandatory when one addresses students
embarking on a subject.

3 E. Hairer and G. Wanner, Analysis by Its History (Springer-New York, 1996), is
a prime example.

4 René Étiemble, a great French specialist of comparative literature, once made
a study of the styles prevailing in various kinds of activities. He came to the
conclusion that mathematical style was the closest there was to the military.

The other principle of this same author - to substitute ideas for com- 
putations - appears even more commendable to me whenever it can be 
applied. All the same, one will, inevitably, find calculations in this book; but 
I have essentially confined myself to those which, inherited from the great 
mathematicians of the past, form an integral part of the theory and can be 
considered as ideas. 

Except occasionally, to round off the text, one will find no exercises here. 
Working at exercises is indispensable when one learns mathematics, and one 
will find them in profusion in many other books and specialised collections. 
The majority of French students, obsessed by the string of examinations
imposed on them, have a very exaggerated tendency to consider the “lectures”
of little use and to believe that only “practical work” and “formulae” count or pay. The
result is that the majority of them are able, up to errors in calculation, to 
integrate a rational function but incapable of answering questions of a general 
nature, e.g. why is a rational function integrable? To understand a theorem 
is to be able to reconstruct its proof. To understand a block of mathematics 
does not reduce to knowing how to apply its results; to understand a theory 
is to be able to reconstruct its logical structure. Every mathematician knows 
this. 

One does not learn analysis or anything else from one single book; there 
is neither Bible, nor Gospel nor Koran in Mathematics. The fact that the 
spirit of my book is radically different from that of Serge Lang, Undergraduate
Analysis (Springer, 2nd ed., 1997) for example, should not dissuade anyone from
reading it, quite the contrary; even less the books of E. Hairer and G. Wanner,
Analysis by Its History, Wolfgang Walter, Analysis I (Springer, 1992,
in German) or Reinhold Remmert, Theory of Complex Functions (Springer-New
York, 1991, translation of Funktionentheorie 1, 4. Auflage, 1995), which
I have often used, and cite when I do so. These excellent books present
numerous exercises, as does Jean Dieudonné’s Calcul infinitésimal (Hermann,
1968) though his style enthuses me less.

I have not acceded to the new fashion which likes to decorate elementary
analysis textbooks with numerical calculations to fifteen decimal places
under the pretext they will be useful to future computer scientists or applied
mathematicians. Everyone knows that the mathematicians of the XVIIth and
XVIIIth centuries loved numerical computations - done by hand, not by
tapping the keys of an electronic gadget - that enabled them to verify
their theoretical results or to demonstrate the power of their methods. This
childhood sickness of analysis disappeared when in the XIXth century one
began addressing rigour of proof and generality of formulations, rather than
formulae.

This does not mean that numerical calculations have become pointless: 
thanks to computers, one can do more and more of them, for better or worse, 
in all scientific and technical areas that, from medical imaging to the
perfecting of nuclear weapons 5, use mathematics. One does the same in certain
branches of mathematics too; for example, displaying a large number of curves 
may open the way to a general theorem or to understanding a topological 
situation, not to speak of the traditional number theory where numerical 
experiment always was, and still is, used to formulate or verify conjectures. 

This only means that the aim of an exposition of the principles of anal- 
ysis is not to teach numerical techniques. Moreover, the partisans of applied 
mathematics, of numerical analysis and of computer science in all the uni- 
versities of the world manifest their imperialist tendencies far too clearly for 
real mathematicians to take on in their stead a task for which they generally 
lack both taste and competence. 

* * * 

The innocent reader and many confirmed mathematicians will probably be
surprised, possibly even shocked, to find in my book some very heavy
allusions to extramathematical subjects and particularly to the relations between
science and weaponry. This is neither politically nor scientifically correct:
Science is politically neutral 6, even when someone lets it fall inadvertently on
Hiroshima while the future winner of a Nobel Prize in physics is recording the
results in a B-29 trailing the Enola Gay 7. Nor is it part of the curriculum: a
scientist’s business is to provide his students or readers, without commentary,
with the knowledge they will later use, for better or for worse, as suits them. It
will be up to them to discover by themselves, possibly years after graduating,
that which “has no place” (why, please?) in scientific books or lectures and
which was not told them by older scientists well aware of it, or who should
have been. Let me give you a few French examples.

5 Nuclear powers agreed a dozen years ago to stop testing. The reason was
that testing was made unnecessary by the improvement of numerical analysis.
The most immediate consequence of this “progress” is that everything is now
done in full secrecy, which was not the case when they had to propel into the
stratosphere two million tons of radioactive rock and sand in order to check their
“gadgets”.

6 An assertion long since demolished by countless studies, notably American,
either of particular facets of scientific activity, or of Science in a general way,
e.g. in Bernard Barber, Science and the Social Order (Collier Books, 1952) and
Jean-Jacques Salomon, Science et Politique (Paris, Ed. du Seuil, 1970, reissued
by Economica). Disclosing the influence of politics, for instance of WW II and the
Cold War, upon Science and Technology is not the same as going into politics, as
so many scientists believe without ever having read any serious historical work.
And I do not see why being opposed to the military exploitation of mathematics
and science should be considered as a more political stand than, for instance,
helping Los Alamos or Arzamas to develop their “weapons of genocide” was.

7 In Alvarez: Adventures of a Physicist (Basic Books, 1987), the trip to
Hiroshima is the very first thing Luis Alvarez relates. He was also one of
the main proponents of the H-bomb and, at the end of October, 1949, went to
Washington to lobby in favor of it. In 1954, he testified at the Oppenheimer
security hearing that Oppenheimer’s opposition to the H-bomb was proof of an
exceedingly poor judgment. Alvarez is one of many similar counterexamples to
the “neutrality of Science” theory.

As a dozen people who mostly work in particle physics assured me,
you can spend five or eight years learning physics without ever having heard 
anything about nuclear weapons. I once checked the chemistry library of my 
Paris university for books by Louis Fieser, a most eminent Harvard chemist 
who, during WW II, was in charge of improving incendiary weapons and 
developed napalm; all of his chemistry books are there, but not his account 
of war work 8.

I found another, particularly caricatural, example in a textbook of physics
for high-school finishing students; as required by the French official
instructions in 1995, the concluding chapter, on the laser (never mind that an
eighteen-year-old boy or girl can understand practically nothing of it), mentioned
a number of civilian applications - ophthalmology, measurement of atmospheric
pollution, compact discs, energy production by laser-induced thermonuclear
fusion 9, etc. - but not a single military use of lasers, a domain in which
French industry was always very strong. This is not only dishonest; it is a
foolish way to hide the truth since students read newspapers, watch TV,
and if they type “laser military history” on www.google.com, they will get
about 164,000 documents!

8 Louis Fieser, The Scientific Method: A Personal Account of Unusual Projects in
War and in Peace (Reinhold, 1964). In the Biographical Memoirs, v. 65, 1994,
of the (American) National Academy of Sciences, Fieser’s biographer has this to
say (p. 165) about his work during WW II: “With the approach of World War II,
Fieser was drawn increasingly into war-related projects. A brief excursion into the
area of mixed aliphatic-aromatic polynitro compounds for possible use as exotic
explosives was followed by studies of alkali salts of long chain fatty acids as
incendiaries, but by far the most important of his war-related work was his long
and intensive study of the quinone antimalarials”, to which the author devotes
one full page. The word “napalm” is nowhere to be found in this fourteen-page
biography, a beautiful example of the art of fooling the reader with opaque
technical jargon. All the more remarkable since Fieser was strongly criticised
during the Vietnam War for his development of napalm. In his long biography
of von Neumann in the Dictionary of Scientific Biography, J. Dieudonné devotes
two lines to what he calls his “government” work without telling us whether
it had to do with, say, the H-bomb or cancer research, two strongly supported
domains of “government” work.

9 This is a very long term project, but the French and American military have
justified this very expensive enterprise by pointing out that the new knowledge
of fusion processes it will provide will be used to improve nuclear weapons, a fact
that is of course not mentioned in the textbook.

Thirty years ago, in part under the influence of what I had seen on American
campuses and read in American newspapers and such reviews as Science
and the Bulletin of the Atomic Scientists, I succeeded, to my great
astonishment, in convincing the head of my Paris university library to start a new
section that would be devoted to what was then called in America Science 
and Society studies. Although it received little money, you can now find there
several thousand (mostly American) books and the main reviews in the
history of science and technology, including the military side of it, the arms
race, economics of research and development, science policy, etc.: nothing is
excluded. But almost all the readers are people who specialise in that field, while
most of the 5,000 scientists working at the university don’t even know of the
existence of this library. Since their specialised libraries are practically empty
in this respect, the conclusion is inescapable: their only sources of information
are their generally narrow personal experience 10, perhaps some historical
articles written in scientific reviews by scientists who have no idea of historical
writing 11, and cafeteria conversations:

The humanist who looks at science from the point of view of his own 
endeavours is bound to be impressed, first of all, by its startling lack of 
insight into itself. Scientists seem able to go about their business in a state 
of indifference to, if not ignorance of, anything but the going, currently 
acceptable doctrines of their several disciplines . . . The only thing wrong 
with scientists is that they don’t understand science. They don’t know 
where their institutions come from, what forces shaped and are still shaping 
them, and they are wedded to an antihistorical way of thinking which 
threatens to deter them from ever finding out 12.

10 It is not always that narrow. As in the USA - the model - there are in
France scientists who have been for a long time in top government committees
or who have cooperated with industry. They obviously know a lot more than the
average researcher, let alone student. But they mostly don’t speak, much less
write, particularly when defence activities are involved. This striking difference
between French and American “Statesmen of Science” can perhaps be explained
by the fact that the political spectrum extends much farther to the left in France
than in the USA, so that defence work was, at least during most of the Cold
War, much more controversial here than on the other side of the Atlantic.

11 One of the books I have recently read is Gregg Herken, Brotherhood of the Bomb:
The Tangled Lives and Loyalties of Robert Oppenheimer, Ernest Lawrence, and
Edward Teller (Henry Holt, 2002), a superb though very concentrated book.
The main text, 334 pages, is followed by over 2,000 notes: an average of six
references to sources per page (and a lot more on the Internet). No active scientist
could spend ten years reading two hundred books and papers already published,
interviewing at length eighty colleagues, discovering and reading hundreds of
recently declassified government files, and organizing this amount of information
into a coherent book.

12 Eric Larrabee, Science and the Common Reader (Commentary, June 1966). As I
said above, old scientists who have long been top consultants to their government
are not as innocent as Larrabee puts it, but the new generation lacks their
experience of science politics.

It appears more honest to me to violate these miserable and far too
comfortable taboos and to put on their guard those innocents who leap into the dark
into careers of which they know nothing. Because of their past and potential
catastrophic consequences, the connections between science, technology and
armaments concern all who go into science or technology or who practice 
them. They have been governed for half a century by the existence of public 
organisations and private enterprises whose function is the systematic trans- 
formation of scientific and technological progress into military progress within 
the limits, often elastic, of the economic capacities of the various countries 
which take part in it: 

With the attention which is paid in these days to weapons of war, there is 
probably no known scientific principle that has not already been carefully 
scrutinized to see whether it is of any significance for defence 13.

In countries - France is a prime example - where discussions on the 
relations between Science and Defence have been dominated for decades first 
by silence, then by a thick consensus 14, and have been totally absent from
university teaching 15, the thing to say to young people is that one of the forms
of intellectual liberty is not to let oneself be dominated by the dominant ideas. 

13 Sir Solly Zuckerman, Scientists and War (London, Hamish Hamilton, 1962,
p. 80); the author was at the time the British Government chief scientist and
had formerly been the head of British military research. There is no reason to
believe that Zuckerman’s statement is no longer valid, particularly in America.

14 “Science et Défense” is the title of a French association founded in 1983 by
Charles Hernu, then the (socialist) Secretary of Defence and future hero of the
Greenpeace affair - the clumsy sinking in Auckland harbour by French agents of
a ship that would have interfered with a French nuclear test in the Pacific.
Supported by the Armament branch of Defence, the association organises a yearly
congress, where, over two days, engineers and scientists lecture on the
technical problems of armaments and the closely related sciences. Several hundred
people attend: military, engineers, industrialists, scientists, and, inevitably,
political scientists and metaphysicians of strategy. France is, to my knowledge,
the only country where what a number of American historians now call the
scientific-military-industrial complex dares to exhibit itself so publicly and
without provoking the least reaction. This would not have been possible before the
conversion of the Socialist and Communist parties to nuclear weapons when, at
the end of the 1970s, they saw a good prospect of winning the 1981 presidential
election.

15 America was, in the 1970s, a notable exception to this general statement:
student protests against the Vietnam war and the cooperation of many university
departments or laboratories with the DoD led some universities to add to their
curriculum lectures on various aspects of “Science and Society” that attracted a
sizeable number of science students, while some teachers in the history of science
saw their audience suddenly grow. Although the traditional back-to-normal
process did not take very long, many of the present generation of specialists found
their calling during this period.

But this requires access to other information sources. It would be
impossible to thoroughly discuss this subject and its history within the framework of
a mathematical treatise. I nevertheless decided to write a few dozen pages -
the Postface to Volume II - in order to give the interested reader an idea of it
and, in particular, to show that the question and the subject do exist. I have
not balked at citing a good number of important bibliographical references
- there are plenty more - which will allow those who so desire to complete,
verify, or discuss this text. I do not have the naive hope that a twenty-year-old
student of mathematics will plunge into this ocean of literature; it would
hardly even be a very good service to encourage him to do so. But maybe 
this text will find readers who are not so young and no longer have to submit 
to examinations or competitions for success. Although the French version of 
this Postface devoted a good deal of space to the French situation, I thought 
it better, in the English version, to emphasise the American situation more 
than I did in French, and this for several good reasons. 

From Pearl Harbor to the present day, America has been the world leader
in this domain - a leader which, for a dozen years, has no longer had
any competitor worth naming and seems to be in a technological arms race
against itself 16, as was already the case when it spent $2,000,000,000 (one
percent of its 1945 GNP) during WW II in order to get the atomic bomb before
the Nazis, who did not believe it could be available in time and devoted
very few resources to it. This American polarisation on scientific weapons,
more or less faithfully imitated in the Soviet Union, Britain and France, 
had enormous political consequences; among others, it compelled the much 
weaker Soviet Union to devote to defence a proportion of its resources which 
must have strongly contributed to its downfall and to the present American 
hegemony. On the other hand, the civilian uses of mostly American military 
innovations in electronics, informatics, aviation, space, telecommunications, 
nuclear power, etc., had a deep influence on the daily life of people every- 
where. Without WW II and the arms race, most of these innovations would 
have come much later, or never, because the financing of research, develop- 
ment and initial production by defence organisations made it possible for 
private enterprises to take risks which, otherwise, would have been barred by 
the return on investment principle that governs civilian innovations. Without 
World War II, no V-2 missiles and no atomic weapons; without these and the 
Cold War, no intercontinental ballistic missiles; without ICBMs and the need 
of the central military authorities for instant worldwide command, control, 
communication, and intelligence - C³I as they call it - no satellites; and
without satellites and many other innovations propelled by the military - 
computers, integrated circuits, Arpanet, etc. - then no Internet, to mention 
only this most spectacular spin-off of the arms race. The idea that civilian 
industry could have, by itself, spent tens or hundreds of billions in order to 
invent, produce and market such gigantic amounts of hardware and software 
at a time when nobody but the military had any proven need for it is foolish. 
Civilian business does not deal in science fiction. 

16 It has been recently disclosed that America will develop in the next 10 or 15
years a hypersonic cruise missile that will be able to strike anywhere on the
Earth in less than two hours from bases in continental America.

WW II and the arms race also contributed to propelling the funding of
scientific research proper to levels which, before 1939, would have seemed
unrealistic in the extreme, a fact of which scientists everywhere were the first
beneficiaries although never nearly as much as American ones 17. This not only
made it possible for many more young Americans to choose scientific careers 
than was the case before WW II, it also attracted to America many scientists 
(and still more engineers) who had been educated elsewhere, a process that 
is continuing to this day - the famous brain drain that was first noticed 
in the 1950s, not to mention some Russian immigrants after 1917 and the 
European Jews in the 1930s. 

* * * 

The French version of this book included many citations and references in 
English, particularly in the Postface to Volume II, this in order to encourage 
the reader to use a language that is absolutely indispensable if one wants 
to inform oneself on anything at all: for clear demographic reasons France 
accounts for only a small proportion of the literature, for instance from 3% 
(technology) to 7% (mathematics) in the sciences on the world scale, and 
although French authors publish excellent books in many domains, scientific 
or not, they cannot be leaders everywhere. There is for instance nothing of 
any value on the history of nuclear weapons, not even of French ones, and 
none of the best American books have been translated. Almost all I know in 
the Science and Defence domain has been learned from American authors, 
although a few French historians of Science and Technology are beginning to 
deal with it. 

There is no need to suggest to readers of the English translation of the 
present book to learn English. One should however warn the beginner that, 
even though well over 60% of the mathematical literature is now in English, 
an ability to read French is, at the research level, still needed. Since 1945, the 
Fields Medal has been awarded to 44 people worldwide; seven of them were 
French, and two more, although they are not, did all their previous work in 
France; the first Abel Prize (a recently created substitute for the nonexistent 
Nobel Prize for Mathematics) was awarded in 2003 to Jean-Pierre Serre, 
who won a Fields Medal in 1954, and others have won, for instance, the Wolf Prize. 
There are in France many more excellent mathematicians than these stars; 
although some publish in English, still many write in French. And there are 
of course German and Russian authors, among others, who still publish in 
the one language they learned as infants, as anglophone authors always did. 

17 In 1965, Isidor Rabi, a winner of the Nobel Prize in physics, pointed out that the 
budget of the Columbia University physics lab had grown from 15,000 dollars before 
the war to three million, and attributed this to the war, which “did wonderful 
things in some respects”. Hans Bethe, another Nobel Laureate, remembered in 
1962 that before WW II he found it difficult to get some $3,000 for a cyclotron 
at Cornell, but that, “Today, $3,000 is pin money. We use it in this laboratory 
in a day”. To be objective, one should also note the fantastic increase of civilian 
research funds allocated to Life Sciences, mainly Biology and Medicine; but even 
in this case, it was WW II, especially the development of penicillin, which at the 
start demonstrated what could be done in these fields with enough money and 
a concerted effort. 
You fortunately don’t need to learn Japanese: Japanese authors do not use 
it at the international level, a most courteous stand when you think that, 
for them, learning English is a lot more work than learning French is for the 
American, or English for the French. 

The fact that English has acquired almost the status of an international 
common language, or lingua franca, has of course its upside, and any other 
reasonably widespread language, as Latin was three centuries ago, would 
do. The prevalence of English is often explained by the fact that it is sup- 
posedly simpler than, say, German, French, or Russian, and that anyway 
anglophones now form a large proportion of scientists at the world level. 
As suggested above, this preponderance of English, which goes far beyond 
Science, is also, and possibly mainly, a corollary of the enormous resources 
American government, industry and private foundations have devoted to Sci- 
ence and Technology since the 1940s and more generally of the overwhelming 
superiority of the American economy 18 . 

There is in France, and probably elsewhere too, a theory according to 
which, thanks to the overwhelming power America acquired in 1945 and still 
more in 1990, the result, or even purpose, of the “invasion of English” is to 
spread across the whole world the American conceptions of society, politics, 
economy, technology, mass media, etc. and to help American enterprises to 
acquire larger and larger parts of foreign markets everywhere, a process that, 
although or because successful, meets strong opposition in many countries. 

Although greatly reinforced by WW II, it started much sooner. The use 
in America of such typical expressions as “richest in the world”, “greatest in 
the world”, “tallest in the world”, “fastest in the world”, “first in the world”, 
etc. was already widespread in the 1900s and was a plain enough symptom. 
Standard Oil, General Electric, Ford were models of multinational companies 
that European enterprises tried (generally without much success at the time, 
if you except I.G. Farben in the 1920s) to imitate. American sewing machines, 
typewriters and accounting machines, agricultural machinery, machine tools 
and, between WW I and WW II, automobiles were invading Europe long 
before computers did. In the 1920s, jazz had already its fanatics everywhere, 
Hollywood’s movies had already 60-80% of the French market, most of the 
best movie theaters were in American hands, and the answer to French at- 
tempts to impose import quotas was a near total boycott of French movies 
in America (19 in 1929, against hundreds of American movies in France), 
a situation which did not improve after WW II. After 1918 it was Wilson, 
a U.S. President with the mind and eloquence of a Protestant missionary, 
who launched the League of Nations, which Congress rejected. The United 
States’ interventionist policy was already quite plain in the Americas, China 
and Japan long before the end of the XIX th century, and as a recent book 19 

18 In some French companies, meetings of the Board are in English because of the 
presence of one or two American members. At the present time, about 20% of 
the total capitalisation of the Paris Stock Exchange is American-owned. 

19 Philippe Roger, L’ennemi américain. Généalogie de l’antiaméricanisme français 
(Paris, Seuil, 2002, 600 pp.) puts everything in historical perspective without 
reminded us, French hostility toward America was powerfully increased by 
the war against Spain in 1898, which was viewed by the French right as a 
threat to European colonial empires, and by the left as a conclusive proof of 
the transformation of an already unpalatable American capitalism into out- 
right imperialism or economic colonialism. As to the present American taste 
for firearms, a unique feature among “civilised” countries as they were called 
in the 1900s, it was Samuel Colt who, during the war against Mexico, trig- 
gered the craze by adopting the American system of manufacture invented 
in arsenals in order to mass produce his celebrated revolvers. Present in- 
equalities in the distribution of income are not worse than they were at the 
time John D. Rockefeller was worth over one billion dollars, i.e. about 2% of 
America’s GNP: a proportion which, nowadays, would amount to some 200 
billion. And New York bar owners pouring French wine on the street 20 were 
seen already long before March 2003. 

Thus, nothing very new under the sun, except that American interna- 
tional preponderance and unilateralism have now acquired the status of an 
official doctrine supported by a host of ideologists invoking a fundamental- 
ist Protestant ethic in order to justify interventions which, in the eyes of a 
vast majority of people everywhere, are nothing but displays of power even 
when they rid nations of barbaric rulers or religious oppression in the hope 
of establishing there a (probably very weak) version of Western democracy. 

That being said, nobody has to appreciate the barbaric music and violent 
movies which presently come from America (the American stars of my youth 
were Charlie Chaplin, Buster Keaton and the Marx Brothers). Americans do 
not merely dictate the export of these productions through international com- 
mercial agreements and by owning big distribution companies; they also sell 
them by finding indigenous customers (or imitators) who are only too happy 
to make money by distributing them among a young and most often uncul- 
tured public. And how would local television fill its hours of programmes, 
how could the cinemas function, without the flow of American productions? 
The work force in France (say) is not large enough to replace American medi- 
ocrity with French mediocrity; and no country is capable of producing a new 
Shakespeare or a new Bartok every day. One therefore broadcasts what is 
available or imitates American crass “games”. 

himself falling into the trap. It goes without saying that many of the criticisms 
that some French intellectuals and politicians of the right addressed to America 
would apply just as well to France. Howard Zinn, A People’s History of the United 
States, 1492-Present (Harper Collins, 2003 edition), while or because very one- 
sided, would be very useful to help understand criticism from the French left, 
which was never as systematic and well organised as Zinn’s, not to mention books 
by Lewis Mumford, Noam Chomsky, etc. 

20 If a boycott of French wines were to lower prices in France, I, for one, would 
not shed crocodile tears at the tragic fate of poor American patriots heroically 
depriving themselves of Chateau Latour at $1,000 a bottle (assuming they don’t 
have a stock of it in their cellar). 
Nor is one obliged to approve the Darwinian concepts of economic competition 
and social relations which, thanks to technologies that have emerged 
straight from the cold war and arms race, are presently expanding under 
the name of “globalisation”: the extension to the planet of a “liberal”, i.e. 
capitalist, and “modern” economic system founded on the principles isolated 
by Adam Smith in 1776 and assimilated erroneously by the robber barons 
who, at the end of the XIX th century, erected the great American capitalist 
enterprises, afterwards revised a little and codified. It is now forbidden to 
shoot the strikers but not to domesticate the unions; to dismiss thousands 
of employees to improve the competitiveness of companies and in return to 
exploit the work force at low pay in developing countries; to push for the 
dismantlement of European social welfare systems hard won after a century 
of struggle but now judged too expensive - or smacking of Socialism? - by 
the alumni of the Harvard Business School and its foreign imitations; to sub- 
orn the public markets by handing cheques to political parties as is presently 
the case in France, Germany, Italy, etc., or, in the Third World, to gangsters 
in high places in order to inundate the countries they rule or own with killing 
machines under the pretext of lowering the unit price for the countries that 
produce them 21 , or in order to secure the rights to exploiting their natural 
resources. It is the reign of money, whose rallying slogan was launched a hundred 
and fifty years ago by a famous French minister: Enrichissez-vous! If 
you can 22 . . . 

That said, America possesses, notably in its universities, an intellectual 
class not to be globally confused with the spokesmen of the Pentagon’s war- 
lords or the operators of Wall Street. In particular and as I said above, no one, 
in France, has revealed the military influence on scientific and technological 
development since 1940 as a number of American historians, particularly of 
the younger generation, have done for a quarter of a century with the help 
of massive documentation; if you are interested in, say, the history of the 
Cold War, you will find in the American literature all the information, points 
of view and opinions you want. There is no need either to point out that 
many American novelists did not wait until 2003 to disseminate unortho- 
dox descriptions of the American society. As to the mathematicians, many 
of whom have always been very critical of official policy, the years I spent 
with my family in the 1950s and 1960s at Urbana, Berkeley and Princeton 
were among the happiest of my life. And when, at the end of October 1961, 

21 A few years ago, the sale to Taiwan by the French company Thomson-CSF of 
very sophisticated frigates generated a $500 million return for (mostly unknown) 
Taiwanese and French politicians or political parties and go-betweens. An inves- 
tigation of the case by the French judiciary was stopped in a most elegant way: 
the Department of Defence classified all Thomson-CSF documents pertaining to 
it. The company decided a few months later to change its name into Thales, a 
rather unpalatable reference to Mathematics. 

22 What Guizot said is: Get rich, by your work and savings - a cynical precept at 
a time when the overwhelming majority of people, after working twelve hours a 
day six days per week, would die as poor as their parents were. 
my Paris flat was destroyed 23 because I had, rather mildly in fact, spoken 
out during a lecture against the savage repression in Paris of a peaceful Al- 
gerian demonstration for independence, I received two days later a telegram 
from J. Robert Oppenheimer inviting me, on very generous terms, to spend 
the remainder of the academic year at the Princeton Institute; we went two 
months later, and there my wife recovered her usual balance. 

It goes without saying that the facts and opinions to be found in this book 
are my own and my full responsibility. They do not commit Springer-Verlag to 
any degree. Some will perhaps reproach my publisher for not having censored 
me. Being ill-placed to do so in their place I prefer, for myself, to thank 
them warmly for having allowed me the liberty to express myself. This is 
an attitude which I surely would not have encountered everywhere and which 
I appreciate for its proper worth. 



Paris 2003. 
Rgodement@aol.com 



23 My wife, who was there, was extremely lucky not to be hurt, though she was 
badly shaken for months afterwards; my three children came home from school 
fifteen minutes after the bombing. The very first thing I noticed after climbing a 
ladder to my place was the police searching my papers (there was nothing to be 
found). Though they certainly identified the authors of this attempt - probably 
candidates to the French equivalent of West Point school, according to my own 
students - I was never told anything and I very much doubt that they ever had 
to bear any unpleasant consequences. 
I — Sets and Functions 



§1. Set Theory - §2. Logicians, Logic 



It is generally understood that mathematicians concern themselves with ob- 
jects or concepts about which they establish theorems by applying logical, 
preferably irrebuttable, arguments. This last detail, which remained theoreti- 
cal for a long time, apart from in the theory of numbers and in the elementary 
geometry inherited from the Greeks, has always scared the majority of people, 
accustomed as they are to produce and hear daily, even in exalted intellectual 
activities, tens of arguments each more contestable than the other: life in so- 
ciety would be impossible if everyone had to provide incontrovertible proofs 
of his assertions and to express himself clearly and without ambiguity 1 . 

Up to the end of the XVII th century the objects of mathematics were 
numbers, geometrical figures, equations or functions more or less directly 
arising from everyday life, astronomy or mechanics: the triangles and conic 
sections of the Greeks, the whole numbers, rationals, or irrationals like $\sqrt{2}$, 
the simplest algebraic equations which one sometimes tried, like Fermat, to 
solve in terms of integers, the trigonometric functions which had been exten- 
sively developed by the Greek and Arab astronomers before the Westerners, 
Napier’s logarithms, Galileo’s parabolas, the velocity of a moving body and 
the calculus of tangents to a curve, which, in the second half of the XVII th 
century, led to the concept of the derivative, the computation of the area 
bounded by a curve - the circle or the parabola in the case of Archimedes - 
which in the same period led to the integral calculus etc. Although still very 
primitive until about 1600, a short while later mathematics exploded thanks 
to the creation of the infinitesimal calculus by Fermat, Descartes, Huyghens, 
Wallis, Cavalieri, and, above all, between about 1665 and 1720, by New- 
ton, Leibniz and the Bernoullis. This opened an epoch, when, by the new 
methods, they solved an amazing number of problems without worrying too 
greatly about the validity of their proofs; the then Prince of Mathematicians 
was Leonhard Euler (1707-1783), a man who “calculated as he breathed”, and 
was so inventive that he not only discovered innumerable formulae which are 
still useful, but - and this is far more difficult - also provided proofs, almost 

1 See various comments on this point in Chap. II, n° 7. 
always shaky, even though the results were themselves correct. This way of 
doing mathematics reached its apogee at the beginning of the XIX th cen- 
tury with Joseph Fourier and his trigonometric series, without which a large 
part of present day mathematics and physics would not have been possible. 
Fourier’s “proofs” are not only invalid; they are also meaningless, resting 
as they do, even if not explicitly, on equations as absurd as 

$$1 - 3 + 5 - 7 + \ldots = 0,$$
$$1^3 - 3^3 + 5^3 - 7^3 + \ldots = 0,$$
$$1^5 - 3^5 + 5^5 - 7^5 + \ldots = 0$$

etc. His results, here again, are nevertheless correct; his first formula is the 
series for the “square wave” as used by all electricians. His general theorems 
on periodic functions were soon correctly proved by quite other methods 
by more serious mathematicians, Abel and above all Dirichlet, who were the 
first, with Cauchy, to introduce rigour and precision into analysis in the 1820s; 
nevertheless Fourier had seen all the simple and fundamental results and had 
invented a method which, right up to our days, has been unceasingly exploited 
in situations far more general and difficult than he could have known. 

One now enters progressively into a new age, when, particularly in Ger- 
many, while waiting for the French and Italians at the end of the XIX th 
century, mathematicians systematically put to the test all the concepts - 
limits, convergence, irrational numbers, continuity, differentiation, integra- 
tion, etc. - which involve, implicitly and more often explicitly, an infinite 
number of operations, and so were not defined in a perfectly clear and un- 
ambiguous way; they challenged all the “geometrically obvious” statements 
which were incorrectly proved and even sometimes false if taken literally, as 
was finally understood. Sometimes apparently monstrous creatures arrived 
on the scene, notably in connection with Fourier series, which continued to 
have surprises in store, and still, in 1997, pose very difficult problems: dis- 
continuous functions which jump brusquely and at all rational values of the 
variable, continuous curves not admitting even a single tangent, functions 
which one did not know how to integrate, not because the “formula” was 
not known, but because they eluded all known definitions of an integral, 
continuous trajectories which passed through all the points of a square, etc. 

On the other hand, there is a branch of mathematics which has always 
escaped these crises because the infinite plays no role there: arithmetic, or 
algebra, and particularly the classical theory of numbers and of algebraic 
equations, particularly the study of “algebraic” numbers, that is, roots of 
polynomial equations with integer coefficients, to which one attempts to gen- 
eralise such classical results as decomposition into a product of prime factors. 
Carl Friedrich Gauss, who did many other things as well as mathematics, considered 
arithmetic so understood to be the “Queen of the Sciences”; he himself 
was the King of Arithmetic between 1800 and 1830 . . . The study of these 
numbers presents enormous methodological difficulties - it took decades of 
efforts by several mathematicians of the first rank before Dedekind discovered 
after 1870 what has become the guiding thread up to our days, the theory of 
ideals - but the results obtained, though still partial and limited in scope, 
and the methods used, left no doubt: the logic of proof was unassailable. 
One knew exactly what one was speaking of when proving a theorem, even 
though, to be sure, not all of the very complex mechanism which governs 
the properties of algebraic numbers had yet been discovered. The problems 
of arithmetic call for the detective’s art; those posed by the fundamentals of 
analysis are rather a matter for philosophical reflection. 

The main part of these developments in arithmetic was due, here again, 
to the Germans influenced by Gauss; he himself de facto founded a dynasty 
which ruled the subject for most of a century. It is hardly surprising that, in 
these circumstances, other Germans, sometimes they themselves (Dedekind), 
little by little elaborated a program named the arithmetisation of analysis, 
as Felix Klein put it in 1895: in other words, to substitute irrebuttable proofs 
in place of the hazy or wrong arguments of the XVII th and XVIII th centuries, 
to substitute completely clear concepts for the vague intuitions founded on 
deceptive geometric images - no one will ever draw the graph of a continuous 
nondifferentiable function -, in sum, as much as possible to “replace compu- 
tation by concepts” as, much later, Bourbaki would write in the Preface to 
his Elements of Mathematics. As we shall see at the beginning of the next 
chapter, one does not have to go very far to understand the need for this 
arithmetisation: what is the number $\pi$? 

There are two fundamental aspects here. In the first place, to elaborate 
a systematic technique for manipulating approximations to allow one to give 
perfectly precise definitions of such concepts as convergence, continuity, dif- 
ferentiability etc. This started, still rather vaguely, with Cauchy in about 
1820, and was completed about 1870-1880 with Weierstrass and his usage of 
$\varepsilon$ and $\delta$, Epsilontik as our German colleagues term it; vast, more or less abstract 
generalisations emerged in the XX th century, where the $\varepsilon$ and $\delta$ would 
be replaced by the concept of “neighbourhood”, but the ideas remained essentially 
his own, and his technique remained necessary and often sufficient 
in the immense majority of branches of analysis. As I shall use them from 
start to finish of this book (though my typist’s habit is to use $r$ and $r'$ where 
Weierstrass and all contemporary mathematicians use $\varepsilon$ and $\delta$), it is needless 
to say more here than, in essence, it consists of demonstrating equalities by 
replacing them by more and more precise inequalities: one shows that $a = b$ 
by proving that $|a - b| < 1/10^n$ for every integer $n$. This is the fundamental 
difference between analysis and arithmetic or algebra. 
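
The step from the inequalities to the equality can be spelled out in a few lines. What follows is only a sketch, and it relies on the Archimedean property of the real numbers (for every real number there is a larger integer), which the text has not yet stated at this point:

```latex
\begin{aligned}
&\text{Assume } |a - b| < 10^{-n} \text{ for every integer } n \ge 1,
  \text{ and suppose that } a \neq b. \\
&\text{Put } d = |a - b| > 0 \text{ and choose an integer } n \ge 1 \text{ with } n > 1/d
  \text{ (Archimedean property).} \\
&\text{Since } 10^{n} > n, \text{ we get } 10^{n} > 1/d, \text{ that is, } 10^{-n} < d = |a - b|. \\
&\text{This contradicts the hypothesis for this particular } n; \text{ hence } a = b.
\end{aligned}
```

No single inequality proves the equality; it is the whole family of them, for every $n$, that does.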

The other aspect, which spread less easily because it constituted a major 
upset to the modes of thought of mathematicians, was the invention of Set 
Theory by Georg Cantor (1845-1918) between about 1870 and 1890: to a first 
approximation this consists of conceptualising that the totality of mathemat- 
ical objects possessing a given property forms, in itself, a new quite distinct 
mathematical object, and of reasoning about such objects; the appearance of 
the “actual infinity” in mathematics, a concept which for centuries has set 
cohorts of metaphysicians and theologians cogitating, and often deliriously 
- Cantor knows and quotes them - and which therefore at the beginning 
provoked violent opposition from among the mathematicians 2 . Cantor and 
Dedekind, with whom he cooperated, were at first interested in more and 
more complicated sets of real numbers - here again it was the theory of 
trigonometric series which furnished Cantor with his initial motivation and 
examples, though the fields of algebraic numbers and Dedekind’s “ideals” are 
also sets of numbers, even if much simpler 3 , - then to sets which, in the eyes 
of many mathematicians of the period, were a matter for metaphysics rather 
than normal mathematics, a criticism which, at the beginning, was legiti- 
mated by the logical paradoxes which arose on applying modes of reasoning 
that rely on an abusive extension of ordinary language to them. One should 
note that before attempting the assault on “transfinite numbers” which, a 
century later, are rarely used in practice, Cantor introduced extraordinar- 
ily useful concepts into the study of sets of numbers or points - open and 
closed sets, “accumulation points” etc. - which do not seem to pose logical 
problems 4 , which many mathematicians then revealed in their most orthodox 
research, and which would complete the clarification of analysis undertaken 
by Weierstrass, of whom, moreover, Cantor had been a student. 

Then a period opens which sees the birth of mathematical logic, of the 
theory of “abstract” sets, and first attempts to axiomatise all of mathemat- 
ics, starting with the theory of integers, i.e. arithmetic. Founded by Gottlob 
Frege and pursued by Giuseppe Peano, Bertrand Russell and Alfred North 
Whitehead, Ernst Zermelo, David Hilbert, etc., to mention only authors 
known or famous before 1914, these new theories were intended to codify 
on the one hand the rules of construction of mathematical arguments (ele- 
mentary logical operations, use of “variables”, the formulation and calculus 
of “propositions”, etc.), and on the other hand the rules of construction of 
mathematical objects (numbers, functions, sets, etc.), and finally to isolate 

2 The opposition came not so much from the fact that the bizarre sets that Cantor 
constructed contained an infinite number of elements: no one objected to consid- 
ering a line, a plane, a curve, as an infinite collection of points, nor to considering, 
for example, the intersection of a plane and a surface. The major objection arose 
from the fact that Cantor sometimes constructed sets by an infinite number of 
intermediate constructions, with arbitrary choices at each stage, inexplicit, and 
even impossible to describe explicitly. Present day mathematicians do this every 
day, but it was not so around 1870-1880. 

3 In the ring of rational integers $\mathbb{Z}$ (i.e. of arbitrary sign) an ideal is a set $I$ possessing 
the property that $ux + vy \in I$ for every $x, y \in I$ and $u, v \in \mathbb{Z}$. Such an 
ideal is the set of multiples of some integer. 
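
The last statement of this footnote can be checked by hand on a small case. The sketch below is an illustration, not a proof: it takes the ideal generated by two integers, lists the combinations ux + vy that fall in a finite window, and compares them with the multiples of the greatest common divisor. The function names, the coefficient bound and the window are assumptions introduced here for the example.

```python
from itertools import product
from math import gcd


def combinations_in_window(x, y, coeff_bound, window):
    """Values u*x + v*y with |u|, |v| <= coeff_bound lying in [-window, window]."""
    values = set()
    for u, v in product(range(-coeff_bound, coeff_bound + 1), repeat=2):
        value = u * x + v * y
        if -window <= value <= window:
            values.add(value)
    return values


def multiples_in_window(d, window):
    """Multiples of the positive integer d lying in [-window, window]."""
    return {k for k in range(-window, window + 1) if k % d == 0}


# Hypothetical example: the ideal generated by 12 and 18.
x, y = 12, 18
d = gcd(x, y)  # the single integer whose multiples form the ideal
assert combinations_in_window(x, y, coeff_bound=20, window=30) == multiples_in_window(d, window=30)
```

The finite window is needed only because a program cannot list an infinite set; the footnote's statement concerns the whole ideal.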

4 But fifty years later it would be discovered that one of the conjectures which 
Cantor and others endeavoured in vain to prove (the Continuum Hypothesis) 
is in fact unprovable and one can, at will, accept or reject it, as with Euclid’s 
Parallel Postulate. 
the axioms starting from which one can erect the whole of mathematics in a 
perfectly coherent way, without, it is to be hoped, stumbling upon an internal 
contradiction. The rules imposed, while allowing all the standard mathemati- 
cal arguments and objects, are strict enough for the early paradoxes to be, one 
hopes, impossible to formulate without infringements; to wit, safety barriers 
beyond which one ventures at one’s own risk and peril. This domain, which 
has been the subject of a large number of works for a century, still presents 
very difficult problems and has sometimes given rise to intense debate; most 
mathematicians observe this from a distance, content with a “naive”, i.e. not 
strictly formalised version, of Logic and Set Theory. 

The principal result of this work is that every mathematical object can be 
considered as a set and indeed that the only logically correct way to define a 
mathematical object is to say that it is composed of one or more sets subject 
to explicitly stated conditions. This method suffers from the inconvenience 
of making the seemingly simplest mathematical objects, the real numbers for 
example, appear as extremely complex elaborations of sets; in the definition 
of the real numbers as “cuts”, due to Dedekind, and simplified by Peano 
and Russell, which we shall give at the beginning of the next chapter, a real 
number, for example $\pi$, is by definition a set of rational numbers (naively: the 
set of all rational numbers $< \pi$); a rational number, in its turn, is 5 the set of 
all pairs of integers $(p, q)$ of arbitrary sign such that $x = p/q$, $q \neq 0$; an integer 
$n$ of arbitrary sign is itself the set of all pairs $(p, q)$ of whole numbers such 
that $p - q = n$; a whole number, at last, is a set, the number 3 for example 
being a set of three elements chosen once and for all; thus the number $\pi$ 
becomes a set of sets of sets of sets, all infinite except for the last. One might 
therefore, starting from the whole numbers, establish a kind of hierarchy of 
complexity in the universe of mathematical objects, as do certain logicians. 
But of course from the moment these objects and the operations to which 
they may be subjected have been defined by applying the standard operations 
of set theory to objects already known, one forgets their explicit definitions, 
their extreme complexity making manipulating them quite impossible; one 
confines oneself to arguing as was always done, up to one detail: one knows 
exactly, or one could know, if one put one’s mind to it, what the symbol $\pi$ 
signifies, as it leaves metaphysics and enters into mathematics 6 . 
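
The successive layers just described can be imitated in a few lines of code. This is a minimal sketch under loud assumptions: whole numbers are taken as Python's non-negative `int` rather than rebuilt from sets; an integer or a rational is represented by one pair, together with a test telling when two pairs name the same number (the "set of all pairs" is then the class of pairs passing that test); and the cut shown is the one for $\sqrt{2}$, chosen because its membership test is elementary, whereas the cut for $\pi$ has no such simple description. All function names are hypothetical.

```python
from fractions import Fraction


def same_integer(a, b):
    """Pairs (p, q) and (p2, q2) of whole numbers name the same integer p - q
    exactly when p + q2 == p2 + q, a test that needs no subtraction."""
    (p, q), (p2, q2) = a, b
    return p + q2 == p2 + q


def same_rational(a, b):
    """Pairs (p, q) and (p2, q2) of integers with nonzero second entry name the
    same rational p/q exactly when p * q2 == p2 * q."""
    (p, q), (p2, q2) = a, b
    if q == 0 or q2 == 0:
        raise ValueError("second entry of a pair must be nonzero")
    return p * q2 == p2 * q


def in_cut_sqrt2(x):
    """Membership test for the set of rationals below sqrt(2):
    x belongs to it when x < 0 or x * x < 2."""
    x = Fraction(x)
    return x < 0 or x * x < 2


# The pairs (5, 2) and (3, 0) both name the integer 3.
assert same_integer((5, 2), (3, 0))
# The pairs (1, 2) and (-3, -6) both name the rational 1/2.
assert same_rational((1, 2), (-3, -6))
# 7/5 lies below sqrt(2), 3/2 does not.
assert in_cut_sqrt2(Fraction(7, 5)) and not in_cut_sqrt2(Fraction(3, 2))
```

As the text says, once the operations are defined nobody computes with these representations; the sketch only shows that each layer is built from the previous one by explicit rules.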

Following this historic evolution one can contemplate Set Theory from two 
different points of view: on the one hand from the “naive” point of view of 

5 See for example §§5 and 28 of my Cours d’algèbre (Hermann, 1966) or Algebra 
(Addison-Wesley, 1972). 

6 The physicist Emilio Segre explains in his memoirs that, when he was in high- 
school, he did not understand why his mathematics teacher stressed the need to 
define the real numbers by means of Dedekind sections, the concept of number 
seeming to be sui generis. This only proves that a Nobel Laureate in physics 
may not understand the mathematics he has been using during his whole life, or, 
if one prefers, not understand the difference between Physics and Mathematics. 
See the beginning of Chapter II. 
mathematicians who use it daily while restricting themselves to the imagining 
of the simplest geometrical and physical images, on the other hand, from the 
“formalised” point of view of logicians who, by systematically applying rules 
of argument, and methods of construction stated once and for all, construct 
everything from nothing “to allow thought to rise above the void by leaning 
on the void”, as a physicist studying Newton’s cosmology wrote 7 . We shall 
confine ourselves to expounding an intermediate point of view very succinctly, 
without using logicians’ language, but nevertheless not letting the reader 
believe that intuitively “evident” results require no justification. The use of 
ordinary language can make propositions seem intuitive, though professional 
logicians will consider them by no means evident, and computers will not 
understand for the simple reason that they lack intuition 8 : to make oneself 
understood by a stupid though disciplined machine one must explain oneself 
in a perfectly correct fashion, even if one only wants to receive information 9 
on the Silicon Valley Paedophiles Brotherhood through the Internet. We shall 
give some very summary details on the language of logicians later. 



7 Loup Verlet, La malle de Newton (Gallimard, 1993), p. 292. Highly recommended 
reading. 

8 Although the invention of computer science has presented several inconveniences 
to mankind - ask ordinary Iraqis what they think of the computers on a cruise 
missile -, it offers a great advantage to mathematical pedagogy: a proof is totally 
correct if a suitably programmed computer could understand it. This point of 
view does not eject the role of intuition from mathematics; it simply shows that 
one must not confuse an intuition with an argument. 

9 For computer scientists, “information” is a sequence of the digits 0 and 1. It is 
interesting to note that before becoming a “cultural rubbish bin”, as the Chinese 
call it, the Net was first conceived as a military system with the purpose of 
assuring American telecommunications in the event of nuclear war, thanks to the 
total interconnection of the given bases and centres of calculation or command; 
it was soon linked to laboratories and university departments highly delighted 
to profit from it (and to profit the military through their expertise) after which 
the military provided themselves with a system for their exclusive use while the 
“civil” system passed into the control of the National Science Foundation, from 
which it has finally been removed. 

There is an enormous American literature on the history of computer science; 
the following titles, by historians who cite their sources, are particularly 
recommended: Kenneth Flamm, Creating the Computer: Government, Industry, 
and High Technology (Brookings Institution, 1988), Paul Edwards, The Closed 
World. Computers and the Politics of Discourse in Cold War America (MIT 
Press, 1996), Arthur L. Norberg and Judy O’Neill, Transforming Computing 
Technology. Information Processing for the Pentagon 1962-1986 (The Johns Hopkins 
UP, 1996), very dry, Thomas P. Hughes, Rescuing Prometheus (Pantheon 
Books, 1998) which also treats other similar subjects (missiles, SAGE, systems 
analysis, etc.), Janet Abbate, Inventing the Internet (MIT Press, 1999), from a 
Ph.D. thesis, Kent Redmond and Thomas M. Smith, From Whirlwind to Mitre. 
The R&D Story of the SAGE Air Defense Computer (MIT Press, 2000), where 
one can see to what extent the development of the gigantic SAGE net for the air 
defence of the American continent influenced the progress of computer science 
and electronics in the 1950s. 
§1. Set Theory 

1 — Membership, equality, empty set 

The concept of a set 10 is a primitive concept in mathematics; one can no more 
provide a definition than Euclid could define mathematically what a point is. 
In my youth there were those who said that a set is “a collection of objects of 
the same nature” ; apart from the vicious circle (what indeed is a “collection” ? 
a set?), to talk of “nature” is empty and means nothing 11 . Certain denigrators 
of the introduction of “modern math” into elementary education have been 
scandalised to see that in some textbooks they have had the temerity to form 
the union of a set of apples with a set of pears; never mind that a normal 
child will tell you that this gives a set of fruits, or even of things, and if asked 
to count the number of elements of the union any moderately intelligent child 
can explain to you that it does not matter that the first set consists of apples 
rather than oranges and the second of pears rather than dessert spoons; the 
fact that the Louvre Museum combines disparate collections - of pictures, 
sculptures, ceramics, gold work, mummies, etc. - has never troubled anyone. 
One calls this: to acquire the sense of abstraction. 

The logicians have in any case long since invented a radical method of 
eliminating questions concerning the “nature” of mathematical objects or 
sets (the two terms are synonymous). One can describe this in a figurative 
way by saying that a set is a “primary” box containing “secondary” boxes, 
its elements , no two of which have identical contents, which in their turn 
contain “tertiary” boxes themselves containing . . . The Louvre is a collection 
of collections (of paintings, sculptures, etc.), the collection of paintings is 
itself a collection of paintings stolen by Bonaparte, Monge and Berthollet in 
Italy (we unfortunately had to return it in 1815), bequeathed by . . . private 
collectors, bought at sales, etc. 

The whole of set theory rests on two sorts of relations. The membership 
relation x ∈ X which is read “x belongs to X” or “x is an element of X”; this 
means that x is one of the secondary boxes contained in the box X, while 
secondary means: not contained in any box other than X itself. The negation 
of x ∈ X is denoted x ∉ X. To express the fact that an object x is an element 
of a set which is itself an element of X, one might write x ∈∈ X; at the next 
level one might write x ∈∈∈ X. These last two notations have not found 
common currency, but I will use them occasionally in this chapter. If one 
considers the Louvre as a set whose elements are its collection of paintings, 
its collection of sculpture etc., then the Mona Lisa ∈∈ Louvre. 

On the other hand there is the equality relation x = y, whose intuitive 
meaning is that the two sets are identical; its negation is written x ≠ y. The 

10 After several attempts Cantor chose the word Menge (quantity, number, 
amount, mass, multitude, crowd). 

11 Cantor defined a set as “every assemblage as a whole (Zusammenfassung zu 
einem Ganzen) M of defined and distinct objects m of our intuition or thought” . 
two relations ∈ and = obey axioms which we shall state gradually as needed; 
they permit us to construct ever more complex sets and relations. 

The first, the axiom of extension , says that two sets A and B are equal 
(i.e., for the mathematician, identical or indistinguishable) if and only if they 
possess the same elements; in other words, if the relations x ∈ A and x ∈ B 
are logically equivalent. In the interpretation of a set X as a nesting of boxes 
it is thus pointless to place in the primary box X two secondary boxes A and 
B which, as nestings of boxes, have exactly the same structure: they are not 
mathematically distinct objects even if physically they appear to be so. If for 
example you place three empty boxes in a box X (or, more generally, three 
copies of the same box), you obtain the same set as if you had placed only 
one, since the axiom of extension shows that the two sets are mathematically 
identical. 
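
The remark about the three empty boxes can be tried out on a machine. The sketch below is only an illustration: it assumes that Python's `frozenset` is an acceptable stand-in for a "box", and the names `three_empties`, `one_empty` and `same_elements` are invented for the occasion.

```python
# Model "boxes" as frozensets: unordered, duplicate-free and hashable,
# so that a box can itself be placed inside another box.
EMPTY = frozenset()

# A box into which three empty boxes have been placed ...
three_empties = frozenset([frozenset(), frozenset(), frozenset()])
# ... and a box into which only one has been placed.
one_empty = frozenset([EMPTY])


def same_elements(a, b):
    """Axiom of extension read as a test: a and b have the same elements."""
    return all(x in b for x in a) and all(x in a for x in b)


print(same_elements(three_empties, one_empty))  # True
print(three_empties == one_empty)               # True
print(len(three_empties))                       # 1
```

The built-in equality of `frozenset` already behaves extensionally, which is why the hand-written test and `==` agree.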

Given two sets X and Y one says that X is contained in Y (or that Y 
contains X, or that X is a subset of Y) if every element of X is an element 
of Y; the notation is X ⊂ Y or Y ⊃ X. It is clear that if X ⊂ Y and Y ⊂ Z, 
then X ⊂ Z. The relation X = Y means that both X ⊂ Y and Y ⊂ X. 

Some authors write X ⊆ Y to maintain the visual analogy with x ≤ y, 
and reserve X ⊂ Y to mean that X is a proper subset of Y. This is a totally 
useless notation. 

If one considers, as has just been suggested, that mathematics consists 
essentially of proving theorems about more or less complex nestings of boxes, 
the simplest box one can imagine is the empty box. One thus needs a particular 
mathematical object, denoted ∅, the empty set ; its existence is the second 
of the axioms of set theory 12 : there exists a set ∅ such that the relation x ∈ ∅ 
is false for every x. 

One meets this set in everyday life. If, when you are travelling by car 
in the Far West, the police arrest you because you have shot a red light at 
the intersection of two flat, deserted, absolutely straight, orthogonal roads 
you might acknowledge that your infraction constituted a mortal danger to 
the, empty, set of motorists visible within a range of ten miles. (You would 
pay the same fine, even so.) On 12 August 1997, the atmospheric pollution 
in Paris having attained too high a level, the Parisian police, relayed by the 
media, generously announced that Paris residents parking their cars within 
their authorised perimeter were graciously absolved, the following day, from 
paying their daily tribute of 15 F for this right 13 ; but all the motorists who 

12 Some logicians prefer to postulate the existence of a set; that of ∅ follows immediately 
from the axiom of separation stated below. 

13 The idea is to encourage Parisian motorists to travel to work by public transport. 
The fact is that normally using your car instead of parking in your street allows 
you to save 15 F for parking and two metro tickets, i.e. enough money to buy 
three litres of petrol. The duty of paying 15 F or a fine of 75 F if you do not use 
your car to get to work before 9 a.m. thus amounts to subsidising the polluters 
and penalising those who use public transport. This is confirmed by the fact 
reside in Paris and park on the public roads know that the set of days of the 
month of August on which this tax is obligatory is empty, in contrast to the 
rest of the year. 

The most remarkable property of the empty set is that everything one can 
say about its elements (though not about itself) is simultaneously true and 
false, and that, further, no logical catastrophe ensues. When I once informed 
a friend that every man who has passed the age of five hundred years makes 
love three times a day she replied “that’s false, I’m sure you wouldn’t be able 
to” ; to this kind of typically feminine ad hominem logic - it is well known that 
women are incapable of reasoning impersonally, objectively and abstractly - 
I obviously replied “certainly, with you”; she then exclaimed “false, I’d be 
dead and I don’t like old men” and, to finish, threw a fit of nerves when I 
replied that that did not contradict the initial proposition one whit. To learn 
to juggle with the innumerable properties of the empty set is an excellent 
exercise for developing your powers of reasoning; you could in particular 
set yourself to detect all statements, including those in the present treatise, 
which, taken literally, are false because the author has forgotten to posit that 
a certain set is not empty: “every continuous function on a compact set attains 
its maximum at a point of this set”, “every bounded set has a strict upper 
bound” , etc.; these statements are false if the set under consideration is empty 
because they affirm the existence of an element (possessing certain properties) 
of the empty set. The perpetrators of such gross errors will generally reply 
that they have passed the age of priggishness and trust to the good sense of 
the reader, who is asked, implicitly, to trust the competence of the author. 
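
A disciplined machine treats the empty set exactly as described. The following sketch assumes Python's built-in `all`, `any` and `max` as stand-ins for "for every", "there exists" and "attains its maximum"; the property `is_positive` is an arbitrary example chosen for illustration.

```python
empty = set()


def is_positive(x):
    return x > 0


# Universal statements about the elements of the empty set hold vacuously,
# whether they assert a property or its negation.
all_positive = all(is_positive(x) for x in empty)
all_non_positive = all(not is_positive(x) for x in empty)

# Existential statements about its elements are false.
some_positive = any(is_positive(x) for x in empty)

print(all_positive, all_non_positive, some_positive)  # True True False

# A statement asserting that a maximum exists fails for the empty set:
# there is no element to hand back.
try:
    max(empty)
    has_max = True
except ValueError:
    has_max = False
print(has_max)  # False
```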

The empty set is a subset of every other set X : since ∅ contains no elements 
at all, all its elements are also elements of X. It is also true that the empty 
set may be an element of another set X as we shall now see. 

It is on the empty set that one leans to “lift oneself up from the void”; 
figure 1 below represents a primary box X containing three secondary boxes 
A, B, C, which are the elements of X; A is the empty set and contains no 
elements, B is a box containing an empty box, so has one element, namely 
the empty box in question, while C is a box with two elements: an empty 
box and a box containing an empty box. 
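
The boxes A, B, C and X just described can be written down explicitly. This is a sketch under the assumption that nested `frozenset` values model the boxes faithfully; the helper `show` is hypothetical and serves only to print the nesting with braces.

```python
EMPTY = frozenset()

A = EMPTY                    # the empty box
B = frozenset([EMPTY])       # a box containing an empty box
C = frozenset([EMPTY, B])    # an empty box and a box containing an empty box
X = frozenset([A, B, C])     # the primary box


def show(s):
    """Write a nested frozenset with braces, using ∅ for the empty set."""
    if not s:
        return "∅"
    parts = sorted((show(e) for e in s), key=lambda t: (len(t), t))
    return "{" + ", ".join(parts) + "}"


print(len(A), len(B), len(C), len(X))  # 0 1 2 3
print(show(X))                         # {∅, {∅}, {∅, {∅}}}
```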

The representation above might lead the reader to believe that there are 
four distinct empty sets in the schema for X; now there is only one empty 
set in Nature, but, like the Holy Ghost, it is everywhere simultaneously. One 
can finesse this niggle by replacing the imagery of boxes by the schema of 
the relation x ∈ y; on writing x → y to facilitate the graphic representation, 
figure 1 would be replaced by the following schema: 

that parking is free between 7 in the evening and 9 in the morning as well as on 
Saturday and Sunday, i.e. outside working hours. One should teach the abc of 
formal logic (a few pages of Plato would suffice) to the bureaucrats who hope to 
fight pollution by taxing the non-polluters. Let us add that in certain countries 
the residents buy a permit at the beginning of each year, so freeing them from 
the daily racket which one suffers in Paris, and for a modest sum. 
The significance of the signs { and } will be explained in the following n°. 

2 — The set defined by a relation. Intersections and unions 

In practice, a set is most often defined by a characteristic property of its 
elements: “the set of integers between 2 and 25”, “the set of points distant 
15 km from a given point O”, “the set of rational numbers < π”, “the set of 
functions f defined on R with values in R”, etc. A seemingly obvious general 
statement, due to Frege, is that for every proposition or relation P{x} in 
which there appears a variable x symbolising a totally undetermined object 
one can speak of the set of those x such that P{x} is true ; this set will be 
unique, by the axiom of extension. If you choose the relation x = x you 
will thus obtain the set of all mathematical objects, a creature which justly 
evoked great suspicion in Cantor; he spoke of it as a “class” of sets, a concept 
which the logicians later used and developed. 

In 1903, as Frege was on the point of publishing the second volume of his 
Grundgesetze der Arithmetik, Bertrand Russell 14 demolished Frege’s whole 

14 On Bertrand Russell, an exceptional personality and formidable source of ideas 
of all sorts, see Ronald W. Clark, The Life of Bertrand Russell (Penguin Books, 
1975). A much more scholarly biography is in course of publication, but the 979 
pages of Clark, of which 160 are notes, bibliography and index, already contain 
much information; sold at £2.95 when it appeared, one cannot ask for much more 
. . . Do not seek mathematical logic there: Clark is a “man of letters” , does not 







edifice, at least twenty years’ work, and a part of his very self, by choosing 
x ∉ x for the relation P{x}. Suppose that there is indeed a set A such that 

(2.1) for every x, x ∈ A is equivalent to x ∉ x. 

Since a relation which is true for every x remains so when one substitutes 
a specific mathematical object for the variable x, one sees that the relations 
A ∈ A and A ∉ A are logically equivalent: contradiction! 

The axiom of separation (Ernst Zermelo, 1908) obviates “Russell’s Paradox”: 
if P{x} is a proposition and X is a set one may speak of the set A of 
those x which belong to X and satisfy P{x}; in logical language 15 : 

(2.2) (x ∈ A) ⟺ (P{x} & (x ∈ X)); 

instead of placing oneself in the absurd universe of all possible mathematical 
objects one places oneself in the specific set X: this is one of the guard-rails of 
the theory 16 . In particular, one may not speak of “the set of all sets”, as was 
done in Cantor’s time, for if such a set X existed the relation x ∉ x would 
define, by (2), a set A ⊂ X satisfying (1), an absurdity. You may certainly 
think of the “class” , “category” , “totality” of sets, but this is not a set in the 
technical sense of the term. 
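
To see the guard-rail at work one can apply separation, inside a given set X, to Russell's relation. The sketch below assumes `frozenset` values as sets; `separate` is a hypothetical helper standing for the axiom, and the particular X is an arbitrary small example.

```python
def separate(X, P):
    """Axiom of separation: the set of those x in X that satisfy P."""
    return frozenset(x for x in X if P(x))


EMPTY = frozenset()
ONE = frozenset([EMPTY])
TWO = frozenset([EMPTY, ONE])
X = frozenset([EMPTY, ONE, TWO])

# Russell's relation "x is not an element of x", restricted to X.
A = separate(X, lambda x: x not in x)

print(A == X)   # True: no element of X is a member of itself
print(A in X)   # False: A is a subset of X, not one of its elements
```

No contradiction arises: the argument of (2.1) would require A to be an element of X, and it is not. The same reasoning shows that no set X can contain the subset that Russell's relation separates out of it.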

When X and Y are two sets one writes X − Y for the set of elements of X 
that do not belong to Y: the axiom of separation legitimates this definition: 
P{x} here is the relation x ∉ Y. By far the most frequent case is that when 
Y ⊂ X; X − Y is then the complement of Y in X; in this case one obviously 
has 

X − (X − Y) = Y. 

understand it, and makes no attempt to appear to understand it. This is a general 
problem in the history of science: when it is written by scientists who understand 
the subject the socio-political aspects disappear or reduce to non-documented 
banalities, and vice versa. The exceptions, for instance Loup Verlet’s book cited 
above, are very rare. Dieudonné resolved this dilemma by saying that the socio-political 
aspects do not explain the scientists’ ideas. Apart from the fact that 
this statement may be false (obvious counterexamples: the logarithms of Napier 
and Briggs for astronomers and navigators, Lavoisier and the French gunpowder 
administration, Gauss and geodesy, Liebig and nitrogenous fertilisers, Haber and 
the direct synthesis of ammonia, von Neumann after 1937, etc.), one is in the 
right not to be solely interested in mathematics or physics in the strict sense. 

15 The sign ⟺ denotes logical equivalence; the sign & indicates the conjunction 
of two statements: the parentheses have the purpose of delimiting relations. See 
the part of this chapter that treats mathematical logic. 

16 The solution found by Zermelo trivially eliminates Russell’s Paradox since then 
one does not have the right to talk “in the air” of the set of x such that x ∉ x: 
one must specify at the outset that one places oneself in a given set. But the true 
problem, resolved by Zermelo and his successors, was to show that on adopting 
their axioms one did not forbid anything commonly done in mathematics. The 
solution would have been rejected if, for example, it had made it impossible to 
construct the real numbers. 
And X − X = ∅ for any X. 

If X and Y are two sets their intersection X ∩ Y is the set of objects 
belonging simultaneously to X and to Y; this definition is again legitimated 
by the axiom of separation applied to X and the relation x ∈ Y. One says 
that X and Y are disjoint when X ∩ Y = ∅. 

The union A ∪ B of two sets is, intuitively, the set of objects belonging 
to A or 17 to B. More generally, consider a set X and think of its elements 
as themselves being sets; the axiom of union affirms the existence of a set 
Y whose elements y are characterised by the following property: there exists 
an A ∈ X such that y ∈ A; logicians call this the union of X, an expression 
generally eschewed by mathematicians. It is the set of x such that x ∈∈ X. 
In the imagery of boxes this signifies that one may suppress the various 
secondary boxes belonging to X and replace them by the tertiary boxes they 
contain, eliminating double mentions as always. 
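
The suppression of the secondary boxes can be sketched in a few lines. The function name `union_of` is hypothetical, and integers are used as stand-ins for the tertiary boxes purely for readability.

```python
def union_of(X):
    """Axiom of union: the set of all y such that y ∈ A for some A ∈ X."""
    return frozenset(y for A in X for y in A)


A = frozenset([1, 2])
B = frozenset([2, 3])
X = frozenset([A, B])

print(union_of(X) == frozenset([1, 2, 3]))  # True: 2 is mentioned only once
print(union_of(X) == A | B)                 # True
print(union_of(frozenset()) == frozenset()) # True
```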

One ought to transform all this into an exciting game, one might even 
make a fortune by patenting it. Mr Gates’ employees would immediately devise 
a multicoloured speaking version (“now find the union of the union of 
the union”) for multimedia computers. There would be several degrees of difficulty, 
characterised by the maximum number of nestings allowed: the BIB 
(Boxes in Boxes) for babies, the BIBIB (Boxes in Boxes in Boxes) etc., up to 
BIBIBI . . . (Boxes in Boxes in Boxes in . . . ) at level ℵ₀. One could, thanks 
to the Internet, organise Olympiads on a planetary scale, as in mathematics. 
The parents of future students of the Polytechnique, of Harvard, or of the 
Todai University in Tokyo, could present BIBIBIB to their offspring at the 
age of six for the gifted, and three for the extra-gifted, the infinite ascensions 
in the BIBIBI . . . being reserved for the Mozarts of Logic. 

The axiom of union would allow one to legitimate logically the definition 
of A ∪ B if one knew that there was a set C of which A and B are elements. 
The existence of C is as “evident” as is, in Euclidean geometry, the existence 
of a unique straight line joining two given points. As evident naively, as 
undemonstrable logically. We therefore need another axiom, the axiom of 
pairs : if A and B are two sets there exists a set C of which A and B are the 
only elements. C is unique, by the axiom of extension. One denotes it {A, B}, 
or {A} if A = B; you will have no trouble verifying that {A, B} = {B, A}. 
Given three sets A, B, C one puts {A, B, C} = {A, B} ∪ {C}, etc. 

The operations of union and intersection have the following nearly obvious 
properties: 



X ∪ Y = Y ∪ X, 
X ∪ (Y ∪ Z) = (X ∪ Y) ∪ Z, 
X ∩ (Y ∪ Z) = (X ∩ Y) ∪ (X ∩ Z), 
X − (Y ∪ Z) = (X − Y) ∩ (X − Z), 

X ∩ Y = Y ∩ X, 
X ∩ (Y ∩ Z) = (X ∩ Y) ∩ Z, 
X ∪ (Y ∩ Z) = (X ∪ Y) ∩ (X ∪ Z), 
X − (Y ∩ Z) = (X − Y) ∪ (X − Z). 

17 In mathematics the conjunction “or” is not disjunctive: if P and Q are statements, 
“P or Q” does not exclude “P and Q” . 


Rather than learn these relations by heart one should be able to reconstruct 
them on the instant; once the notation has been understood these rules reduce 
to simple common sense. 
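
For the sceptical reader, the eight rules can also be checked mechanically on a small example. The sketch below tries every triple of subsets of a three-element set; it assumes Python's `|`, `&` and `-` on `frozenset` as union, intersection and difference, and the helper names are invented. A finite check of this kind is of course not a proof.

```python
from itertools import combinations, product


def subsets(universe):
    """All subsets of a finite collection, as frozensets."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def identities_hold(X, Y, Z):
    return all([
        X | Y == Y | X,
        X & Y == Y & X,
        X | (Y | Z) == (X | Y) | Z,
        X & (Y & Z) == (X & Y) & Z,
        X & (Y | Z) == (X & Y) | (X & Z),
        X | (Y & Z) == (X | Y) & (X | Z),
        X - (Y | Z) == (X - Y) & (X - Z),
        X - (Y & Z) == (X - Y) | (X - Z),
    ])


all_subsets = subsets(range(3))
checked = all(identities_hold(X, Y, Z)
              for X, Y, Z in product(all_subsets, repeat=3))
print(len(all_subsets), checked)  # 8 True
```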

The axiom of separation eliminates Russell’s Paradox, yet, while the relation 
x ∉ x has a very normal look, the opposite relation, x ∈ x, seems 
very strange: one has never seen a museum of painting which is an element 
of its own collection of pictures, and one would be hard put to it 
to realise x ∈ x in the Game of Boxes of Boxes; it would be even more 
strange to consider sets x, y and z such that at the same time x ∈ y, y ∈ z 
and z ∈ x: as Suppes more or less said “if you do not believe that this is 
against intuition then try to find an example”; one might add à la Serge 
Lang: if you succeed you will instantly be world famous among mathematicians 
because you will have demolished a theory painstakingly constructed 
over a century by excellent or very great mathematicians. The relations 
which might seem to permit this have been eliminated by means of the 
axiom of regularity or foundation ( Fundierung in German) formulated by 
von Neumann in 1925 and simplified by Zermelo in 1930: it says that if one 
considers the elements of a nonempty set A as themselves being sets then 
there exists a set X ∈ A such that X ∩ A = ∅; we shall use this in n° 9, 
but one has no occasion to use it in practical mathematics. To deduce the 
impossibility of a relation such as x ∈ y ∈ z ∈ x one applies this axiom to 
the set of three elements A = {x, y, z} and deduces a contradiction since 
then 

x ∈ A ∩ y, y ∈ A ∩ z, z ∈ A ∩ x, 

so that the intersections of A with its elements are all nonempty. One can 
also deduce the impossibility of an unending descending chain such that 
x₁ ∋ x₂ ∋ x₃ ∋ . . . ; such a relation contradicts the axiom of regularity 
for the set A = {x₁, x₂, x₃, . . . }, since, for every p, one has xₚ₊₁ ∈ xₚ ∩ A 
and thus A ∩ x ≠ ∅ for every x ∈ A. A descending chain of membership 
relations thus always leads to the empty set if one pursues it far enough. 
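
On finite examples the axiom of regularity can be observed directly. The sketch below assumes nested `frozenset` values as sets (Python cannot build one that contains itself, so every example is automatically well founded); `regular_witness` and `descend` are hypothetical names.

```python
EMPTY = frozenset()
ONE = frozenset([EMPTY])        # {∅}
TWO = frozenset([EMPTY, ONE])   # {∅, {∅}}


def regular_witness(A):
    """Return an X ∈ A with X ∩ A = ∅, or None if A has no such element."""
    for X in A:
        if not (X & A):
            return X
    return None


def descend(x):
    """Follow a chain x ∋ x' ∋ x'' ∋ ... until the empty set; count the steps."""
    steps = 0
    while x:
        x = next(iter(x))   # pick any element
        steps += 1
    return steps


A = frozenset([ONE, TWO])
print(regular_witness(A) == ONE)   # True: ONE ∩ A = ∅, whereas TWO ∩ A = {ONE}
print(descend(TWO) in (1, 2))      # True: every chain from TWO stops at ∅
```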

On the other hand there are unending ascending chains, for example 

0 ∈ 1 ∈ 2 ∈ 3 . . . 
as we shall see in the next n°. 

3 — Whole numbers. Infinite sets 

Applied to the innocent empty set the formation of pairs leads to nonempty 
sets, not yet logically guaranteed to exist at this stage of the work: in the 
first place {∅}, the set whose only element is the empty set (box containing 
an empty box), then {{∅}}, the set whose only element is the set whose only 
element is the empty set (box containing a box containing an empty box) etc. 
The relation ∅ ∈ {{{∅}}} is false; the correct relation is ∅ ∈∈∈ {{{∅}}}: the 
empty set belongs to a set which belongs to a set which belongs to {{{∅}}}. 
These sets are pairwise distinct: the relation {{∅}} = {{{{∅}}}} for example 
would imply {∅} = {{{∅}}} by the axiom of extension, then ∅ = {{∅}} for the 
same reason, then {∅} ∈ ∅, which is false. An empty box contains nothing, 
not even an empty box. 

This type of construction furnishes a possible definition of the primary 
objects studied in mathematics, namely the whole numbers or natural integers. 
According to Zermelo, 1908, and in conformity with the programme of 
reducing everything to set theory, one defines them by 

(3.1) 0 = ∅, 1 = {∅}, 2 = {{∅}}, . . . ; 

simple abbreviations 18 for very particular sets. A computer would understand, 
but it would take twenty seconds for a machine running at 100 MHz to 
write or read the number 10^9, assuming that one cycle is enough to recognise 
the signs { and }. 

Another method of defining the integers, equivalent 19 to the preceding, 
due to von Neumann 20 , consists of putting 

18 In particular the sign = used in these definitions is not that of set theory; when 
introducing a definition the logicians (and now some mathematicians) prefer to 
use the sign :=, the sign : warning the reader that one is introducing a definition 
or new notation, and not a relation to be proved. 

19 The precise mode of definition of the integers (or of any other mathematical 
object) is of no importance so long as the various possible definitions lead to the 
same theorems; the “nature” of mathematical objects is irrelevant because one 
only asks them to be models of real objects. For the same reason the symbol 
used by computer scientists to designate the number 15 is unimportant on the 
theoretical plane, so long as the computers are programmed to recognise it. 

20 1923; he was twenty years old and in the course of spending a few years learning 
chemistry in Zurich because his father, a banker in Budapest, wanted to direct 
him to a profession more lucrative than mathematics; he had already read Cantor 
and Co. at least three years earlier. This definition of the integers, used by N. 
Bourbaki, made certain French users of mathematics laugh at a certain period. A 
list, even though incomplete, of von Neumann’s activities, from 1937 - classical 
explosives, operational research and game theory, the A-bomb, the H-bomb, 
intercontinental missiles, computers - should reassure the philistines of his sense 
of the concrete. But he was not intellectually cynical. Laurent Schwartz, Un 
mathématicien aux prises avec le siècle (Odile Jacob, 1997), p. 288, translated 
as A Mathematician Grappling with his Century (Birkhäuser, 2001) writes about 
me that I have “never pardoned von Neumann for having forsaken mathematics 
to create computer science” ; it is true that thanks to a colleague from Princeton 
we knew vaguely in 1947, at Nancy, that von Neumann “was now doing numerical 
calculations” though we were ignorant of their purpose; but, at least so far as 
I am concerned, we know a lot more half a century later . . . On von Neumann 
(3.2) 0 = ∅, 1 = {∅} = {0}, 2 = {∅, {∅}} = {0, 1}, 

3 = {∅, {∅}, {∅, {∅}}} = {0, 1, 2}, 

4 = {∅, {∅}, {∅, {∅}}, {∅, {∅}, {∅, {∅}}}} = {0, 1, 2, 3}, 

etc. For example, 3 is the set whose elements are (a) the empty set, (b) the 
set whose unique element is the empty set, (c) the set whose only elements 
are the empty set and the set whose unique element is the empty set (fig. 1). 
Thus ∅ ∈ 4, ∅ ∈∈ 4, ∅ ∈∈∈ 4 and ∅ ∈∈∈∈ 4. Figure 2 shows the two possible 
definitions of the whole number 4 in the imagery of boxes. 




fig. 2. a) 4 according to Zermelo; b) 4 according to von Neumann 



Given a set x the logicians call the set s(x) = x ∪ {x} the successor of x; 
we have s(x) ≠ x since x ∉ x. One thus obtains the integers by applying this 
operation repeatedly to the empty set: 0 is the empty set, 1 is the successor of 
0, 2 the successor of 1, etc. In other words, if 14 is a primary box containing 
fourteen secondary boxes, then 15 is the primary box containing identical 
copies of the fourteen secondary boxes contained in 14 and the box 14 itself, 
which is not identical with any of the fourteen boxes it contains. In what 
has just been written “fourteen” is what is known to everyone who knows how to 
read, write and count, while 14 is the mathematical or logical number defined 
by the method of Zermelo or von Neumann; a computer understands 14 but 
not fourteen; for humans it is usually the opposite. 
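
Since a computer is said to understand 14, one may as well let it construct the number. The following sketch builds both kinds of integers from the empty set, assuming `frozenset` values as sets; the function names `zermelo`, `successor` and `von_neumann` are chosen here for illustration.

```python
EMPTY = frozenset()


def zermelo(n):
    """Zermelo's n: the empty set enclosed in n pairs of braces."""
    s = EMPTY
    for _ in range(n):
        s = frozenset([s])
    return s


def successor(x):
    """s(x) = x ∪ {x}."""
    return x | frozenset([x])


def von_neumann(n):
    """Von Neumann's n: the successor operation applied n times to ∅."""
    s = EMPTY
    for _ in range(n):
        s = successor(s)
    return s


print(len(von_neumann(4)), len(zermelo(4)))                           # 4 1
print(von_neumann(4) == frozenset(von_neumann(k) for k in range(4)))  # True
print(successor(von_neumann(14)) == von_neumann(15))                  # True
```

The first line of output shows the difference between the two definitions: von Neumann's 4 has four elements, Zermelo's has one.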

Von Neumann’s definition seems much more complicated than Zermelo’s: 
to write the number 10^9 explicitly requires 2^(1 000 000 000) parentheses, an integer 
with about three hundred million digits, and 2^(999 999 999) mentions of ∅. But 
in conformity with intuition it defines 14 as a set of fourteen elements. It 

and computer science, see William Aspray, John von Neumann and the Origins 
of Modern Computing (MIT Press, 1990). 
presents another advantage, because of which it has been adopted almost 
universally for the construction of the “infinite ordinals” of Cantor, as we 
shall show in n° 9. 
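
The counts of parentheses and of mentions of ∅ quoted for von Neumann's integers can be checked for small n. The sketch below assumes that ∅ is written as a single symbol and that each nonempty set costs one opening and one closing brace; `braces` and `empties` are hypothetical helpers.

```python
import math

EMPTY = frozenset()


def von_neumann(n):
    s = EMPTY
    for _ in range(n):
        s = s | frozenset([s])
    return s


def zermelo(n):
    s = EMPTY
    for _ in range(n):
        s = frozenset([s])
    return s


def braces(s):
    """Number of signs { and } needed to write s out in full."""
    if not s:
        return 0
    return 2 + sum(braces(e) for e in s)


def empties(s):
    """Number of mentions of ∅ when s is written out in full."""
    if not s:
        return 1
    return sum(empties(e) for e in s)


for n in range(1, 9):
    assert braces(von_neumann(n)) == 2 ** n
    assert empties(von_neumann(n)) == 2 ** (n - 1)
    assert braces(zermelo(n)) == 2 * n

# Number of decimal digits of 2^(10^9): about three hundred million.
digits = math.floor(10 ** 9 * math.log10(2)) + 1
print(digits)
```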

The question of how to know whether the sets used to define the integers 
are themselves elements of some set, in other words, that of the existence of 
the set N of whole numbers , is not logically evident; it would be, by the axiom 
of separation, if one already knew of the existence of a set of which at least 
all the integers are elements, but where to find one? All the sets we have 
actually constructed up to now, starting from the one set whose existence 
is guaranteed a priori , namely ∅, have a finite number of elements, while, 
according to all the evidence, N must possess infinitely many, whatever the 
precise definition of this term. To justify its existence one thus introduces the 
axiom of infinity : one possible formulation of this is to posit the existence of 
a set X such that 

(3.3) (∅ ∈ X) and (x ∈ X implies s(x) ∈ X), 

as would clearly be true of N if one already knew that N existed; this is 
not the case of the sets 0, 1, 2 etc. defined above: we have 2 ∈ 3 but the 
relation s(2) ∈ 3 is false. A set satisfying the conditions (3) is sometimes 
called inductive . Let us show how to construct N starting from here. 

First, it is clear that every union or intersection of inductive sets is inductive. 
If, in any inductive set X, one considers the intersection X₀ of all 
the inductive sets X' ⊂ X one thus obtains the “smallest” inductive set contained 
in X. If Y is another inductive set then X ∩ Y is an inductive subset 
of both X and of Y, so contains X₀ and Y₀; Y₀ is thus an inductive subset 
of X, whence Y₀ ⊃ X₀ and vice versa ; in other words X₀ = Y₀. The set X₀ 
defined in each inductive set X is thus the same independently of X; it is, by 
definition, the set N of whole numbers, and, similarly, the smallest of “all” 
inductive sets (which are too numerous to be the elements of a set). 

Since N is inductive and contains 0 it contains all the whole numbers à la 
von Neumann. This proves that they belong to a common set, and therefore 
we may speak of the set E of these integers. It is clear that it is inductive 
and is contained in N. Hence E = N, so the elements of N are precisely the 
integers à la von Neumann. 

From this follows the principle of proof by induction : to show that a property 
P{n}, in which the letter n symbolises an indeterminate “variable”, is 
true for every n ∈ N one shows that 

(i) it is true for n = ∅, i.e., in ordinary language, for n = 0, 

(ii) the relation P{n} implies P{s(n)} i.e., in ordinary language, that P{n} 
implies P{n + 1}. 

If this is so, then the set of n ∈ N satisfying P{n} is inductive and contained 
in N, so equal to N. 
4 — Ordered pairs, Cartesian products, sets of subsets 

If a and b are mathematical objects one has {a, b} = {b, a}. If, on the other 
hand, you associate to every point of a plane furnished with coordinate axes 
its two coordinates x and y and denote the corresponding point by the classical 
notation (x, y), it is clear that in general (x, y) ≠ (y, x). One is therefore 
led to associate with two objects x and y written in a determinate order a 
new object (x,y), an ordered pair , (or, in French, couple) the rule of equality 
for two ordered pairs being that 

(4.1) (x, y) = (u, v) if and only if x = u and y = v. 

Sometimes one says that x and y are the projections or coordinates of the 
ordered pair (x,y). Similarly one defines triplets 

(x,y,z) = ((x,y),z), 

quadruplets 

(x, y, z, t) = ((x, y, z), t), 

etc. The rules of equality for such objects are obvious. 

The axiom of pairs, used in n° 2 to define doubletons (non-ordered pairs) 
{x,y}, allows one to introduce ordered pairs without stepping outside the 
framework of Set Theory, by agreeing, for example, that 

(4.2) (x, y) = {{x}, {x, y}}, 

an astute idea due to the Pole Kazimierz Kuratowski (1921). Suppes tells us it 
was already to be found in another form in 1914 in the work of the American 
Norbert Wiener, the “Father of Cybernetics” , as journalists knowing nothing 
else of him have called him since 1950. The relation (x, y) = (u, v), i.e. 

(4.3) {{x}, {x, y}} = {{u}, {u, v}}, 

in fact forces either {u} = {x}, or {u} = {x, y}. In the first case one has 
u = x; in the second, x = y = u; so u = x in either case. If x = y, one 
has {x, y} = {x}, hence {{x}, {x, y}} = {{x}} so that (3) can be written 
{{x}} = {{x}, {x, v}}, which implies {x, v} = {x} i.e. v = x = y; if x ≠ y, 
the second member of (3), which can be written {{x}, {x, v}} since u = x, 
cannot contain the element {x, y} ≠ {x} unless {x, v} = {x, y}, and since 
x ≠ y this forces v = y. In conclusion, one sees in every case that the 
condition (1) is satisfied by the definition (2) of ordered pairs. 
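
The rule of equality for Kuratowski's pairs can also be tested by brute force on a few objects. In the sketch below the integers 0, 1, 2 are arbitrary stand-ins for mathematical objects and `kpair` is a name chosen for the illustration; an exhaustive check on three objects does not replace the argument just given.

```python
from itertools import product


def kpair(x, y):
    """Kuratowski's ordered pair (x, y) = {{x}, {x, y}}."""
    return frozenset([frozenset([x]), frozenset([x, y])])


values = range(3)
rule_holds = all(
    (kpair(x, y) == kpair(u, v)) == (x == u and y == v)
    for x, y, u, v in product(values, repeat=4)
)

print(rule_holds)                                  # True
print(kpair(1, 1) == frozenset([frozenset([1])]))  # True: (x, x) = {{x}}
print(kpair(2, 3) == kpair(3, 2))                  # False
```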

Exercise . Draw the boxes or graphs that represent the sets (2, 3) and (3, 2). 

Given two sets X and Y the set of ordered pairs (x, y) for which x ∈ X 
and y ∈ Y is called the cartesian product of X and Y, and is denoted X × Y. 
More generally one defines X × Y × Z = (X × Y) × Z, the set of triplets 
(x, y, z) with x ∈ X, y ∈ Y, z ∈ Z, etc. If X is a set one defines 
X³ = X × X × X, etc. 

The existence of the cartesian product is plain from the naive point of view:
for the logicians it demands a proof. Now the elements z of X × Y are
characterised by the following relation P{z}: there exist an x ∈ X and a y ∈ Y
such that z = {{x}, {x, y}}. The existence of X × Y can therefore be deduced
from the axiom of separation so long as one knows in advance that those z
satisfying P{z} belong to a common set Z; but that precisely is the whole
problem.

Instead of just postulating the existence of X × Y axiomatically the
logicians go much further. In the formula z = {{x}, {x, y}}, z is a set whose
elements {x} and {x, y} are subsets of the union X ∪ Y. The axiom which
allows one to resolve the problem affirms in a general way that, for every set
X, there exists a set P(X) whose elements are the subsets of X:

(4.4) Y ∈ P(X) ⟺ Y ⊂ X.

If, for example, X = {a, b, c} where a, b, c are pairwise distinct, P(X) has
the elements

∅, {a}, {b}, {c}, {b,c}, {a,c}, {a,b}, {a,b,c}.
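
The set of subsets of a small finite set can be listed mechanically. The following Python sketch is only an illustration under our own conventions (the function name powerset is an assumption, and subsets are represented as frozensets); it recovers the eight subsets of a three-element set.

```python
from itertools import chain, combinations

def powerset(xs):
    """Return the set of all subsets of xs, each represented as a frozenset."""
    items = list(xs)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}

X = {"a", "b", "c"}
P = powerset(X)

assert len(P) == 8                 # 2**3 subsets
assert frozenset() in P            # the empty set
assert frozenset(X) in P           # X itself
assert all(Y <= X for Y in P)      # every element of P(X) is a subset of X
```

A set with n elements has 2^n subsets, so this brute-force enumeration is reasonable only for very small n.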

Returning to the cartesian product of X and Y, its elements z are, after
definition (2) of Kuratowski, sets whose two elements are subsets of X ∪ Y,
so elements of P(X ∪ Y); one thus has z ⊂ P(X ∪ Y) and so, by definition,
z ∈ P(P(X ∪ Y)). The definition (2) of ordered pairs thus provides (axiom
of separation) a set

X × Y ⊂ P(P(X ∪ Y)).
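
On a tiny example the inclusion can be verified by brute force. This Python sketch assumes frozensets as a model of finite sets; the helper names kpair and powerset are ours. It builds P(P(X ∪ Y)) explicitly and checks that every Kuratowski pair lies in it.

```python
from itertools import combinations

def kpair(x, y):
    """Kuratowski ordered pair {{x}, {x, y}}."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def powerset(xs):
    """All subsets of xs, as frozensets."""
    items = list(xs)
    return {
        frozenset(c)
        for k in range(len(items) + 1)
        for c in combinations(items, k)
    }

X, Y = {1, 2}, {2, 3}
PP = powerset(powerset(X | Y))        # P(P(X ∪ Y)): 2**(2**3) = 256 elements
product = {kpair(x, y) for x in X for y in Y}

assert len(PP) == 256
assert product <= PP                  # X × Y is a subset of P(P(X ∪ Y))
assert len(product) == len(X) * len(Y)
```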

This argument (which you may well forget once you have understood it: the
sole thing to retain is the condition for two ordered pairs to be equal) may
appear somewhat esoteric and abstract, but it has the merit of showing that
the product X × Y can be constructed by means of the standard operations
of Set Theory, and for logicians, further, that it is logically founded, and
obviates the risk of internal contradictions of the type of the Russell Paradox.
One has to appreciate that Logicians are an even more bizarre lot than
Mathematicians: they feel the need to prove everything, including what is plain
to see to the “general public”. In matters electronic, the latter are content to
press the buttons on the black boxes and check that “it works”; the
professionals seek to understand what happens inside them.

The construction of P(X) for every set X is useful in many other
circumstances. The definition of the real numbers proposed by Dedekind amounts
to saying, as we shall see at the beginning of Chapter II, that a real number
x is a set of rational numbers (intuitively, the set of ξ ∈ Q such that ξ < x).
It is thus an element of P(Q) so, thanks to the axiom of separation, one can
speak of the set R ⊂ P(Q) of real numbers. After having constructed the real
numbers as elements of P(Q) and proved their fundamental properties one of
course forgets their manifestly complicated definition.

One also uses P in the study of equivalence relations. Such is any relation
R (denoted for example by xRy) between the elements of a set X satisfying
the following conditions: (i) xRy and yRz imply xRz, (ii) xRy implies yRx,
(iii) xRx is true for all x ∈ X; consider for instance the relation “x − y is
a multiple of 3” between signed integers. If a ∈ X the axiom of separation
allows one to speak of the set C(a) of all x ∈ X such that xRa is true; this is
called the equivalence class of a. It is immediate that a ∈ C(a) and that two
classes C(a) and C(b) are either identical or disjoint. The axiom of the set
of subsets allows us to consider the C(a), a ∈ X, as the elements of a new
set, contained in P(X), denoted X/R, and called the quotient of X by the
equivalence relation R. See for example the short §4 of my Algebra, where
you will find examples and also applications to the construction of the signed
integers and of the rational numbers. Observe in passing that, by explicit
construction, and not just from the general abstract theory, a quotient set is
a set of sets.
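
The quotient can be computed explicitly when X is finite. The Python sketch below is an illustration only: the function name quotient is ours, and the relation passed to it is assumed (not checked) to be an equivalence relation. It uses the relation “x − y is a multiple of 3” on the integers from −6 to 6.

```python
def quotient(X, related):
    """Return X/R as a set of equivalence classes (frozensets).

    `related(x, a)` is assumed to be an equivalence relation on X.
    """
    return {frozenset(x for x in X if related(x, a)) for a in X}

X = range(-6, 7)
classes = quotient(X, lambda x, y: (x - y) % 3 == 0)

assert len(classes) == 3
assert frozenset({-6, -3, 0, 3, 6}) in classes

# Two classes are either identical or disjoint, and together they cover X.
assert all(c == d or not (c & d) for c in classes for d in classes)
assert set().union(*classes) == set(X)
```

The result is, as stated above, a set of sets: each element of X/R is itself a subset of X.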

5 — Functions, maps, correspondences 

The concept of the cartesian product allows one to introduce the general
concept of a function or map, which is as fundamental as that of a set and
which, as we shall see, reduces to it as do all others. In elementary education
and in the whole history of mathematics up to the beginning of the XIXth
century, a function was given by a “formula” such as f(x) = x^2 − 3, f(x) =
sin x, etc., but starting with Descartes one often also defined a function from a
curve whose “equation” one sought. For experimental scientists and engineers
a function is very often also given by its graph, the geometrical locus of those
points (x,y) in the plane such that y = f(x) for a function f which, quite
often, one does not really know.

Starting with the XIXth century the concept of a function ceased to be
associated with a simple or complicated “formula”; the German Dirichlet for
example speaks of the function equal to 0 if x is a rational number and to 1
if x is irrational, and one later envisaged much stranger functions, until the
general and abstract concept emerged of a function defined on a set X and
having values in a set Y; such a function f associates to every x ∈ X a well
determined y = f(x) ∈ Y depending on x according to a precise rule. The
graph of f is then the set of ordered pairs (x, y) ∈ X × Y such that y = f(x).
One encounters this in everyday life: if, in a monogamous
society, one denotes by H the set of married men and by F the set of women,
the relation “y is the wife of x” is a function with values in F defined on H.
Its graph is clearly a set of . . . couples.

Conversely, a subset G of X × Y is the graph of a function f provided that
G has the following property: for every x ∈ X there exists one, and only one,
y ∈ Y such that (x, y) ∈ G; and then one writes y = f(x). This convention
allows one to reduce the concept of a function to that of a set: by definition
a function defined on X with values in Y is a subset of X × Y subject to the
preceding condition; no longer is there a “formula”.
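
For finite sets the condition imposed on G can be tested directly. In the following Python sketch the function name is_function_graph is ours, and a graph is assumed to be given as a set of Python tuples (x, y).

```python
def is_function_graph(G, X, Y):
    """True when G is a subset of X × Y containing exactly one pair (x, y) for each x in X."""
    if not all(x in X and y in Y for (x, y) in G):
        return False
    return all(sum(1 for (a, _) in G if a == x) == 1 for x in X)

X = {0, 1, 2}
Y = {0, 1, 4, 9}
square = {(x, x * x) for x in X}          # the graph of x -> x^2 on X

assert is_function_graph(square, X, Y)
assert not is_function_graph(square | {(1, 9)}, X, Y)   # two values at x = 1
assert not is_function_graph(square - {(2, 4)}, X, Y)   # no value at x = 2
```

The last two sets are still subsets of X × Y, hence correspondences between X and Y, but they are no longer graphs of functions.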

On suppressing the restriction imposed on G one obtains the concept of a
correspondence or relation between X and Y: two elements x ∈ X and y ∈ Y
correspond under G if (x,y) ∈ G. If one returns to the preceding example and
replaces the set H by the set of all men, and does not assume that the society
is monogamous, the relation “y is one of the wives of x” is a correspondence
between H and F. One does not insist on the existence for every x ∈ H of
a y ∈ F such that (x, y) ∈ G, nor does one insist that such a y be unique;
the x for which there exists such a y constitute the set of definition of the
correspondence (the owners of a harem); the y such that one has (x,y) ∈ G
for at least one x constitute the image or the set of values (the women of a
harem). If X = Y = R, the relation x^2 + 3y^2 = 1, whose graph G in the plane
is an ellipse, is a correspondence: its set of definition is the set of x such that
|x| ≤ 1, its image the set of y such that |y| ≤ 1/√3; this correspondence is
not a function since a real number may have two distinct square roots. The
formula x < y is likewise a correspondence (one more often says “relation”
in cases of this sort) whose graph the reader will have no trouble finding.

In actual practice one often uses other expressions. Instead of saying 

let f be a function defined on X with values in Y,
one often says

let f be a map of X into Y
or

consider a map f : X → Y.

When f is given by a “formula” one also, for example, speaks of
the map x ↦ x^3 of X into Y,

assuming that this makes sense; do not confuse the signs → and ↦; the
string x ↦ x^3 does not denote a map of the set x into the set x^3, it denotes
the function or map which to each element x of X associates the element x^3
of Y.

Let us again observe that in mathematics, when speaking of a function
or map f, one must specify the set X on which f is defined and the set Y
in which it takes its values. To speak without further specification of “the
function x^2” is meaningless²¹. The map x ↦ x^2 of the interval 0 < x < 1
of R (the set of real numbers) into the set R is not the same as the map
x ↦ x^2 of R into R; their graphs are different. Moreover, they do not have
the same properties: in the first case the equation f(x) = b, for a given b ∈ R,
has one or no solution, while it can have two in the second case. Neglecting
these “details” leads to confusion and errors of reasoning.

21 The logicians nevertheless speak of functional relations without specifying the
sets of departure or arrival: they mean a relation R{x, y} between two “variables”
x and y such that

R{x,y'} & R{x,y''} implies y' = y''.

If A is a subset of a set X, the characteristic function of A (relative to
X) is the map χ_A : X → {0, 1} given by

χ_A(x) = 1 if x ∈ A, = 0 if x ∉ A.

If X = R, one can sketch its graph easily if A is, for example, the union of
a finite number of pairwise disjoint intervals - it consists of horizontal line
segments, with “jumps” at the extremities of these intervals - but you will
not manage if A = Q, the case Dirichlet had spoken of already in about 1830.
The principal interest of these functions is to transform relations between sets
into relations between functions, for example:

χ_{A∩B}(x) = χ_A(x)χ_B(x),

χ_{A∪B}(x) = χ_A(x) + χ_B(x) − χ_A(x)χ_B(x),

χ_{X−A}(x) = 1 − χ_A(x),

etc.
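
These three identities are easily checked on a finite example. In this Python sketch the name chi is ours, and the sets X, A, B are arbitrary choices made for the illustration.

```python
def chi(A):
    """Characteristic function of A, returned as a map x -> 0 or 1."""
    return lambda x: 1 if x in A else 0

X = set(range(10))
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

for x in X:
    a, b = chi(A)(x), chi(B)(x)
    assert chi(A & B)(x) == a * b            # intersection
    assert chi(A | B)(x) == a + b - a * b    # union
    assert chi(X - A)(x) == 1 - a            # complement relative to X
```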



Instead of speaking of functions one often speaks in mathematics of
families of numbers, sets, etc. The only difference between these concepts relates
to the notation employed: given two sets I and X, a family of elements of X
indexed by I, denoted

(x_i)_{i∈I},

consists of associating an x_i ∈ X to each index i ∈ I; the preceding notation
is thus just another way of speaking of the map i ↦ x_i of I into X, i.e. of
the map f : I → X given by

f(i) = x_i for every i ∈ I.

One might do entirely without this concept, whose historical origin lies in
sequences of real numbers



u_1, u_2, ..., u_n, ...



which we will meet from the beginning of the next chapter, for example the 
sequence 



1, 1/2, . . . , 1/n, . . . ; 



for classical analysts, who concerned themselves only with functions where
the variable can take all the real values in an interval, the notation u_n denotes
term number n in the sequence; but you can, without the least inconvenience,
and sometimes to advantage, as we shall see, write u(n) for what is usually
written u_n and declare that a sequence of real numbers is nothing other than
a map of the integers > 0 into the set of real numbers.

When the terms of a family (X_i)_{i∈I} are considered as sets²² one can define
the union and intersection

$$\bigcup_{i \in I} X_i, \qquad \bigcap_{i \in I} X_i$$

of the family (in abbreviated form ⋃ X_i, ⋂ X_i); this is the set of x such that
x ∈ X_i for at least one i ∈ I in the first case, for every i ∈ I in the second.
One recovers the concepts introduced above by choosing I to be a set of two
elements. In the general case there are formulae similar to those we have
already mentioned in this particular case: if for example I is itself the union
of a family (I_k)_{k∈K} of sets, then

$$\bigcup_{i \in I} X_i = \bigcup_{k \in K} \Bigl( \bigcup_{i \in I_k} X_i \Bigr) \quad \text{and} \quad \bigcap_{i \in I} X_i = \bigcap_{k \in K} \Bigl( \bigcap_{i \in I_k} X_i \Bigr)$$

(associativity of the intersection or of the union): if one wanted to regroup
in one hypermuseum all the pictures belonging to all the various European
museums one might begin by uniting all the museums of each country, after
which one could regroup the national supermuseums so formed; in this
example K is the set of European states and, for each k, I_k is the set of museums
in country k. All formulae of this type reduce to common sense despite their
abstract and rebarbative appearance.

Given a set X, a subset A of X, and a family (E_i)_{i∈I} of subsets of X,
the E_i are said to cover A, or to be a covering of A, when A ⊂ ⋃ E_i. For
example, the family of intervals (n − 1, n + 1), where n is an integer varying
between 0 and p, covers the interval (0,p) (in R).

The concept of the union of a family of sets arises notably in the
construction of a function by pasting. Consider, for example, a function whose graph
is piecewise linear on the interval (0, 1); it is not given by a unique “formula”
valid everywhere; on certain intervals it might be the function y = 2x + 5, on
others y = −x − 1, etc. More generally, suppose that a set X is the union of
a family of sets (X_i)_{i∈I} and that for each i ∈ I we are given a map f_i : X_i → Y;
does there exist a function f on X such that f coincides with f_i on each X_i?
An obvious necessary condition for the existence of f is that if an x ∈ X
belongs to two sets in the family then the values at x of the two corresponding
functions must be equal:

f_i(x) = f_j(x) for every x ∈ X_i ∩ X_j.

22 This precision may seem superfluous since all the objects one studies in
mathematics are sets. But, in practice, it can happen that (quite consciously) one
forgets this point, and it happens as often that one keeps it present in mind. All
depends on context. Simple example: I is the set of points of a circle and X_i is
the line (set of points in the plane) tangent to the circle at the point i ∈ I.

Conversely, if this compatibility condition is satisfied then the function f
does exist: to define f(x) at an x ∈ X one chooses arbitrarily an i ∈ I such that
x ∈ X_i, and then puts f(x) = f_i(x); the compatibility condition is exactly
what is needed to eliminate any ambiguity in the definition of f(x). Further,
the graph G of f is the union of the graphs G_i ⊂ X_i × Y ⊂ X × Y of the f_i.
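
Pasting is easy to imitate with finite maps. In this Python sketch, which is only an illustration, each f_i is assumed to be given as a dictionary {x: f_i(x)}, and the name paste is ours; the function refuses to glue pieces that violate the compatibility condition.

```python
def paste(pieces):
    """Glue maps given as dicts into one map on the union of their domains.

    Raises ValueError if two pieces disagree at a common point.
    """
    glued = {}
    for piece in pieces:
        for x, y in piece.items():
            if x in glued and glued[x] != y:
                raise ValueError(f"pieces disagree at x = {x!r}")
            glued[x] = y
    return glued

f1 = {x: 2 * x + 5 for x in (-4, -3, -2)}   # y = 2x + 5
f2 = {x: -x - 1 for x in (-2, -1, 0)}       # y = -x - 1; both give 1 at x = -2

f = paste([f1, f2])
assert f == {-4: -3, -3: -1, -2: 1, -1: 0, 0: -1}

# The graph of f is the union of the graphs of the pieces.
assert set(f.items()) == set(f1.items()) | set(f2.items())
```

Replacing f2 by a map taking another value at x = −2 makes paste raise an error, which is the finite counterpart of the necessary condition stated above.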

The situation is particularly simple if one has a partition of X, i.e. a
family of pairwise disjoint sets (X_i)_{i∈I} whose union is X. For every x ∈ X,
there then exists one and only one i ∈ I such that x ∈ X_i, so that one may
choose the f_i arbitrarily. If, on R, for each integer n of arbitrary sign you
have a function f_n defined for n ≤ x < n + 1 (do not confuse the signs ≤
and <, see Chap. II, n° 2), then there exists a function f defined on R which
agrees with f_n for n ≤ x < n + 1. If, on the other hand, the f_n are given for
n ≤ x ≤ n + 1, then f exists only if f_n(n + 1) = f_{n+1}(n + 1) for every n.

The concept of a family of sets is linked to the axiom of choice: given a
family of nonempty subsets (X_i)_{i∈I} of a set X there exists a map f : I → X
such that f(i) ∈ X_i for every i ∈ I. Intuitively, one obtains f by “choosing
arbitrarily” an element x_i from each X_i. Cantor and others used it implicitly
until it was identified explicitly (Zermelo, Whitehead and Russell). As we said
at the beginning of this chapter, many mathematicians objected to “infinities
of random choices” with no precise mathematical sense that could never lead
to “explicit” formulae. No matter: it has survived by virtue of its use
in all sorts of branches of mathematics, where, most of the time, one uses it
without even a mention. Moreover, it was later proved (Gödel, 1938; Paul
Cohen, 1963) that the axiom of choice is logically independent of the other
axioms of set theory: if they are themselves not inconsistent, as one hopes
(though this has never been proved), then adjoining either the axiom of choice
or its negation will not lead to a contradiction. You can adopt it or reject it.
What is more, there are branches of mathematics - arithmetic, for example -
which can be constructed without using it.

The axiom of choice comes in when one tries to extend the concept of
the cartesian product to an arbitrary family (X_i)_{i∈I} of sets. Their cartesian
product, properly, can only be the set of families (x_i)_{i∈I} such that x_i ∈ X_i for
every i ∈ I. The axiom of choice amounts to saying that a cartesian product
of nonempty sets is always nonempty.

6 — Injections, surjections, bijections 

Let us return to maps in general. Given three sets X, Y, Z and maps
f : X → Y and g : Y → Z, one can construct the composed map
h : X → Z by putting

h(x) = g[f(x)] for every x ∈ X.

If F ⊂ X × Y and G ⊂ Y × Z are the graphs of f and g, the graph H ⊂ X × Z
of h is the set of ordered pairs (x,z) having the following property: there exists
a y ∈ Y such that both (x, y) ∈ F and (y, z) ∈ G. It is immediate that H is
a graph. The composed map h is denoted g ∘ f: thus

(6.1) (g ∘ f)(x) = g[f(x)],

but in fact one always writes g ∘ f(x) instead of (g ∘ f)(x). This concept
generalises what one does when speaking of the function sin cos x - one ought to
write sin ∘ cos - or, in geometry, when defining the “product” of two
homotheties, translations, etc.
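
The description of H in terms of F and G translates literally into code. In this Python sketch, given purely as an illustration, graphs are assumed to be finite sets of tuples and the name compose is ours.

```python
def compose(F, G):
    """Graph H of g∘f: the pairs (x, z) for which some y has (x, y) in F and (y, z) in G."""
    return {(x, z) for (x, y) in F for (y2, z) in G if y == y2}

F = {(x, x + 1) for x in range(3)}       # f(x) = x + 1 on X = {0, 1, 2}
G = {(y, y * y) for y in range(1, 4)}    # g(y) = y^2 on Y = {1, 2, 3}
H = compose(F, G)

assert H == {(0, 1), (1, 4), (2, 9)}     # h(x) = (x + 1)^2
```

The same function applies unchanged to arbitrary correspondences, for which the result need no longer be the graph of a function.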

Given a map f : X → Y, one frequently has to consider those x ∈ X
such that f(x) = b, where b is a given element of Y. As regards their existence,
all cases are clearly possible. The simplest is that where the equation f(x) = b
has at least one solution no matter what b ∈ Y; f is then said to be surjective
(or is a surjection). The map x ↦ x^3 of R into R is surjective, since every
real number, whatever its sign, has a cube root. The map x ↦ x^2 is not,
because only positive numbers have square roots.

More generally, if one replaces X by one of its subsets A one is led to
introduce the set of those b ∈ Y such that the equation f(x) = b has at least
one solution in A (and possibly elsewhere²³). This set, denoted by f(A), is
called the image of A by f, a concept familiar from elementary geometry: the
image of a circle by a translation is a circle. Clearly

(6.2) f(A ∪ B) = f(A) ∪ f(B),
but

the relation f(A ∩ B) = f(A) ∩ f(B) is false

in general, since for b ∈ f(A) ∩ f(B) the equation f(x) = b has at least one
solution in A and at least one solution in B, but why should they be the
same?

The situation is simpler if f is injective or is an injection, i.e. if the
equation f(x) = b has at most one solution for every b ∈ Y. In this case, it is
clear, one always has f(A ∩ B) = f(A) ∩ f(B).
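
A small numerical experiment shows both phenomena. In this Python sketch the helper name image and the particular sets A and B are our own choices; the map x ↦ x^2 is not injective on the integers, whereas x ↦ x^3 is.

```python
def image(f, A):
    """Direct image f(A) of a finite set A."""
    return {f(x) for x in A}

def square(x):
    return x * x

def cube(x):
    return x ** 3

A = {-2, -1, 0}
B = {0, 1, 2}

# (6.2) always holds.
assert image(square, A | B) == image(square, A) | image(square, B)

# For the non-injective map x -> x^2 the analogue for intersections fails:
assert image(square, A & B) == {0}
assert image(square, A) & image(square, B) == {0, 1, 4}

# For the injective map x -> x^3 it holds.
assert image(cube, A & B) == image(cube, A) & image(cube, B)
```

Here 1 = (−1)^2 = 1^2 has one solution in A and one in B, and they are not the same.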

Together with the concept of image - one also says the direct image - we
have, in the inverse sense, the inverse image under f of a subset B of Y: this
is the set of x ∈ X such that f(x) ∈ B; the notation is f^{-1}(B). This time
one has

(6.3) f^{-1}(B' ∪ B'') = f^{-1}(B') ∪ f^{-1}(B''),
      f^{-1}(B' ∩ B'') = f^{-1}(B') ∩ f^{-1}(B'');

in the first case one has to consider the x ∈ X such that f(x) ∈ B' ∪ B'', i.e.
such that f(x) ∈ B' or f(x) ∈ B''; in the second, those x such that f(x) ∈ B'
and f(x) ∈ B'', whence the formulae, which you can extend to the case of
unions and intersections of arbitrary families of sets.

23 In mathematics one says what one says and does not say what one does not
say. Observing this rule in public or private life might eliminate a great number
of stupid discussions of the type “You say that the French are racists. Do you
believe that the Americans are not?”

It can happen that a map f : X → Y is simultaneously injective and
surjective; one then says that f is bijective or is a bijection; this means that
for every b ∈ Y the equation f(x) = b has one and only one solution x ∈ X.
The map x ↦ x^3 of R into R is bijective. The map x ↦ x + 1 of N into N
is injective but not surjective; it is a bijection of

N = {0, 1, 2, 3, ...} onto {1, 2, 3, 4, ...} = N − {0}.

The map x ↦ x^2 of R into R is neither injective nor surjective; it becomes
bijective if one replaces R by R_+, the set of real numbers > 0.

When a map f : X → Y is bijective one can define the inverse map
f^{-1} : Y → X as follows: its graph G ⊂ Y × X is the set of ordered pairs
(y,x) such that (x,y) ∈ F, the graph of f. Since one and only one x ∈ X
corresponds to each y ∈ Y, G really is a graph and one sees that

(6.4) x = f^{-1}(y) ⟺ y = f(x).

It comes to the same to say that

f^{-1} ∘ f(x) = x for every x ∈ X and f ∘ f^{-1}(y) = y for every y ∈ Y.

If one writes id_X for the identity map x ↦ x of X into X then
f^{-1} ∘ f = id_X, f ∘ f^{-1} = id_Y.

For example, the inverse map of x ↦ x^3 on R is x ↦ x^{1/3}, the cube root
of x.

It is clear that if one composes maps which are all injective, or all
surjective, or all bijective, one obtains a map of the same kind: to solve g(f(x)) = c,
one has to find a b such that c = g(b), then an x such that b = f(x), whence
the results.
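
For maps between finite sets the three properties can be decided by inspection. The Python sketch below is an illustration only: the function names are ours, and each map is assumed to take its values in the set Y supplied to it. It counts the solutions of f(x) = b exactly as in the definitions above.

```python
def is_injective(f, X):
    """f(x) = b has at most one solution x in X, whatever b."""
    values = [f(x) for x in X]
    return len(set(values)) == len(values)

def is_surjective(f, X, Y):
    """f(x) = b has at least one solution x in X for every b in Y."""
    return set(Y) <= {f(x) for x in X}

def inverse(f, X, Y):
    """Inverse of a bijection f of X onto Y, returned as a dict y -> x."""
    if not (is_injective(f, X) and is_surjective(f, X, Y)):
        raise ValueError("f is not a bijection of X onto Y")
    return {f(x): x for x in X}

def square(x):
    return x * x

def cube(x):
    return x ** 3

def successor(n):
    return n + 1

X = [-2, -1, 0, 1, 2]
assert not is_injective(square, X)
assert is_injective(cube, X) and is_surjective(cube, X, {-8, -1, 0, 1, 8})
assert is_injective(successor, [0, 1, 2])
assert not is_surjective(successor, [0, 1, 2], [0, 1, 2, 3])

cube_root = inverse(cube, X, {-8, -1, 0, 1, 8})
assert all(cube_root[cube(x)] == x for x in X)   # the inverse undoes f on X
```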

7 — Equipotent sets. Countable sets 

It is “obvious” that if X is a finite set - a concept which we have not yet
defined strictly - and f is an injective map of X into a set Y then the image
f(X) has as many elements as X. If, in particular, Y = X, then f cannot be
injective unless it is surjective too; this is the property which Dedekind used
to define a finite set, the others being said to be infinite. (There are other
ways of proceeding, as we shall see.) The example of the map n ↦ n + 1 of
N into N shows that N is infinite in Dedekind’s sense.

When there is a bijection of a set X onto a set Y one says that X and Y
are equipotent (or have the same power, which assumes that we have defined
the difficult concept of the “power” of a set, which generalises that of “number
of elements”; see n° 9); since the composition of two bijections is a bijection
it is clear that if X is equipotent to Y and Y is equipotent to Z then X is
equipotent to Z.

The concept of equipotence is familiar as applied to finite sets: it is the 
foundation of the naive definition of the whole numbers. Its extension to 
infinite sets was Cantor’s first great idea. Seen from 1997 it does not have 
a very revolutionary air, but when Cantor proved that the set N of whole 
numbers is equipotent to the set Q of rational numbers he created a sensation: 
there were no more rational numbers p/q than integers, and yet there is an
infinity of rationals between 0 and 1, then between 1 and 2, etc.?

Further, Cantor’s proof was accessible to anybody. Every rational number
can be written in a unique way in the form p/q with p and q having no
common divisor (i.e. are relatively prime) and q > 0. One can then group the
rational numbers according to the value of |p| + q; since q is positive there are
only finitely many numbers for which |p| + q has a given value s. You write
on a line of infinite length the numbers for which s = 0 (there are none), then
those for which s = 1, etc.; you obtain

0/1, 1/1, -1/1, 1/2, -1/2, 2/1, -2/1, 1/3, -1/3, 3/1, -3/1,

4/1, -4/1, 3/2, -3/2, 2/3, -2/3, 1/4, -1/4, 5/1, etc.

In this way one can assign to every irreducible fraction x = p/q an integer
n = f(x), its rank in the above order; hence we have a bijection from Q onto
N after agreeing to assign rank zero to 0/1 = 0, qed.
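
The enumeration can be produced by a short program. The Python sketch below is our own illustration (the name rationals is an assumption): it groups the irreducible fractions by s = |p| + q, and inside each group it lists them by increasing |p|, a choice of ours since the order within a group is immaterial to the argument. It reproduces the list above up to s = 4.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd

def rationals():
    """Yield every rational number exactly once, grouped by s = |p| + q."""
    for s in count(1):
        for p in range(s):            # p = |numerator|, q = s - p > 0
            q = s - p
            if gcd(p, q) == 1:        # keep irreducible fractions only
                yield Fraction(p, q)
                if p > 0:
                    yield Fraction(-p, q)

first = list(islice(rationals(), 11))
assert first == [
    Fraction(0), Fraction(1), Fraction(-1),
    Fraction(1, 2), Fraction(-1, 2), Fraction(2), Fraction(-2),
    Fraction(1, 3), Fraction(-1, 3), Fraction(3), Fraction(-3),
]

# No rational number is listed twice among the first 300 terms.
assert len(set(islice(rationals(), 300))) == 300
```

The rank n = f(x) of a rational x is then simply its position in the sequence so generated.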

A similar argument will show the existence of bijections of N onto N × N,
onto N × N × N [group the elements (p,q,r) of N × N × N according to the
value of p + q + r], etc., or onto Q × Q, Q × Q × Q, etc.

If there is a bijection of N onto a set X one says that X is countable. Our 
convention, in contrast to that of some other authors, is that finite sets are 
not reckoned as countable, but in fact we shall often say “countable” when 
actually meaning “finite or countable” . 

There are several useful theorems on countable sets; we shall confine our- 
selves to giving semi-naive proofs of them. 

(1) Every subset Y of a countable set X is finite or countable. To see this
naively one writes the elements of X as a sequence x_1, x_2, ..., suppresses
those x_n ∉ Y, and reenumerates the remaining elements, i.e. those of Y.

(2) The image Y = f(X) of a countable set X under a map f is finite
or countable. We enumerate the elements of X as we have just done and put
y'_n = f(x_n); one thus obtains all the elements of Y, in general several or even
infinitely many times. To write the elements of Y once and once only one
proceeds as follows: put y_1 = y'_1, then let y_2 be the first term of the sequence
of y'_n which is ≠ y_1, then y_3 the first term of the sequence y'_n different from
y_1 and y_2, and so on indefinitely.

(3) The cartesian product X × Y of two countable sets X and Y is
countable. We have essentially proved this in showing above that Q is countable:
one writes the ordered pairs of whole numbers (p, q) “diagonally”:

(0,0), (0,1), (1,0), (0,2), (1,1), (2,0), ... .
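
The diagonal enumeration, and the rank it assigns to each pair, can be written down explicitly. In the Python sketch below the names diagonal_pairs and rank are ours, and the closed formula for the rank is a standard consequence of this particular ordering rather than something stated in the text.

```python
from itertools import count, islice

def diagonal_pairs():
    """Enumerate N × N along the diagonals p + q = 0, 1, 2, ..."""
    for s in count(0):
        for p in range(s + 1):
            yield (p, s - p)

def rank(p, q):
    """Position of (p, q) in the enumeration above, starting from 0."""
    s = p + q
    return s * (s + 1) // 2 + p       # s(s+1)/2 pairs lie on the earlier diagonals

assert list(islice(diagonal_pairs(), 6)) == [
    (0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0)
]
# rank is the inverse of the enumeration, hence a bijection of N × N onto N.
assert all(rank(p, q) == n
           for n, (p, q) in enumerate(islice(diagonal_pairs(), 500)))
```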

(4) The union of a finite or countable family of finite or countable sets is
finite or countable. Indeed, let (X_i)_{i∈I} be such a family, where I is finite or
countable like the X_i. For each i choose a surjective map f_i : N → X_i and
define a map f of the cartesian product N × I onto the union X of the X_i as
follows:

f(n, i) = f_i(n)

for n ∈ N and i ∈ I. For every x ∈ X there exists an i ∈ I such that x ∈ X_i,
so an n ∈ N such that x = f_i(n). The map f is thus surjective, and since the
product N × I is countable, X is finite or countable too.

(5) Every infinite set contains a countable set. If X is infinite there is a
bijection f from X onto a set Y ⊂ X distinct from X. Choose a ∈ X − Y
and put

x_0 = a, x_1 = f(x_0), x_2 = f(x_1), ....

If one had x_p = x_q for a pair of integers such that p < q, one could deduce
that x_{p−1} = x_{q−1} since f is injective, whence, continuing this argument,
x_0 = x_{q−p} = f(x_{q−p−1}), which is impossible since x_0 ∉ Y = f(X).

(6) Let X and D ⊂ X be two sets; suppose that D is countable and X − D
is infinite; then X and X − D are equipotent. From the preceding result,
X − D contains a countable set D' and one has

X = Y ∪ (D ∪ D'), X − D = Y ∪ D'

where Y = X − (D ∪ D') is disjoint from D and D'. It is not hard to construct
a bijection g of Y onto itself; since D and D' are countable, so is D ∪ D'; thus
there is also a bijection h of D ∪ D' onto D'. Then one obtains a bijection
f of X onto X − D by putting f(x) = g(x) [for example f(x) = x] for all
x ∈ Y and f(x) = h(x) for all x ∈ D ∪ D'; f is clearly injective and

f(X) = f[Y ∪ (D ∪ D')] = f(Y) ∪ f(D ∪ D') = Y ∪ D' = X − D.

(7) The set of finite subsets of a countable set X is countable. From point
(4) above it is enough to show that, for a given n, the n-element subsets
of a countable set X form a countable subset P_n of P(X). Now consider
the cartesian product X^n, the set of systems (x_1, ..., x_n) of elements of X,
and the map f : X^n → P(X) which transforms (x_1, ..., x_n) into the set
{x_1, ..., x_n} ⊂ X. Its image clearly contains P_n; and it is countable since X^n
is, by (2) and (3); thus P_n is too, since P_n is clearly not finite.

8 — The different types of infinity

Since there are “no more” rational numbers than whole numbers one can go 
further and ask whether there are not also “as many” whole numbers as real 
numbers (rational or irrational), which would allow one to “enumerate” all 
the points of a line. The answer is negative (Cantor, 1874). 

Let us confine ourselves to considering the set X = ]0, 1] of numbers
x such that 0 < x ≤ 1. As we shall see in Chap. II, such a number can
be written in decimal notation in the form x = 0.x_1x_2x_3... with “digits”
x_1, x_2, ... between 0 and 9; this expansion is unique if, for all n, one insists
that the number 0.x_1...x_n be strictly less than x, so that, for example, 1/4
is written as 0.2499999... and not 0.2500000.... That said, consider a map
f of N into X; for all n ∈ N, denote by a_n the n-th digit of f(n) and, for all n,
let us choose a b_n ≠ a_n between 1 and 9. Consider the number b = 0.b_1b_2...
whose n-th digit is b_n for every n. It belongs to X but not, as we shall see, to
the image f(N) of N under f, whence the result: a map f of N into X is never
surjective, let alone bijective.

Indeed, suppose that b = f(n) for an n ∈ N. Since the digits of b are all
≠ 0, the decimal expansion b = 0.b_1b_2... cannot terminate in an unending
sequence of zeros; thus 0.b_1...b_p < b for all p, strict inequality, from which
we see that this decimal expansion definitely satisfies the condition imposed
above. If one had b = f(n) for a particular n, the n-th digit b_n of b would
be, from the construction of b, different from the n-th digit of f(n) = b, i.e.
from b_n; which is absurd²⁴.
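
The construction of b can be imitated on a finite table of digits. This Python sketch is only a finite caricature of the argument, since the real proof concerns infinite expansions; the name diagonal_digits, the sample rows and the rule “take 1 unless the diagonal digit is 1, in which case take 2” are our own choices among the many possible b_n.

```python
def diagonal_digits(rows):
    """Given rows[n] = first digits of f(n), return digits b_n with b_n != rows[n][n].

    Each b_n is 1 or 2, hence lies between 1 and 9 as required.
    """
    return [1 if row[n] != 1 else 2 for n, row in enumerate(rows)]

rows = [
    [2, 4, 9, 9, 9],    # 0.24999... (that is, 1/4)
    [3, 3, 3, 3, 3],    # 0.33333...
    [1, 4, 1, 5, 9],
    [9, 9, 9, 9, 9],
    [5, 0, 0, 0, 1],
]
b = diagonal_digits(rows)

assert b == [1, 1, 2, 1, 2]
assert all(b[n] != rows[n][n] for n in range(len(rows)))   # b differs from f(n) at digit n
assert all(b != row for row in rows)                       # so b is none of the rows
```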

Since the set Q of rational numbers is countable, one sees that R − Q, the
set of irrational numbers, is equipotent to R (n° 7, point 6).

When there exists a bijection of R, the set of real numbers (geometrically:
the set of points of a line), onto a set X, one says that X has the power of the
continuum. One of the most paradoxical of Cantor’s results is that R × R has
the power of the continuum; in other words, there exist bijective maps of the
set of points of a line onto the set of points of a plane: there are “no more”
points in a plane than on a line. Here again there is a very simple proof²⁵,
for example by using the binary counting of the computer scientists; others
would say “of Leibniz”, but he had invented a calculating machine and also
was one of the precursors of formal logic, which, among other titles, makes
him an honorary computer scientist.

24 Cantor’s method for showing that R is not equipotent to N is very different but
presupposes some knowledge. Suppose that we could write all the real numbers
between 0 and 1 as a sequence u_1, u_2, ..., and let us construct a sequence of
compact intervals I_1 ⊃ I_2 ⊃ I_3 ⊃ ... of lengths > 0 in [0, 1] and such that
u_n ∉ I_n for all n (if u_n ∉ I_{n−1}, choose I_n = I_{n−1}; if u_n ∈ I_{n−1}, choose for I_n
an interval contained in I_{n−1} and not containing u_n: this is possible since I_{n−1}
does not reduce to a single point). The results of Chap. III, n° 9 show that the
I_n have a point x in common; if one had x = u_n for some n, then one would
have x ∉ I_n, which is absurd.

25 That of Cantor, much more scholarly, exploits the classical theory of continued
fractions.

In binary counting, every real number between 0 and 1 can be written 
with the aid of a sequence of digits 0 and 1 , and in a unique way, if one insists 
that the sequence does not consist only of 0 from a certain point on. If one 
considers only those digits equal to 1, this amounts to writing x in the form 



$$x = \frac{1}{2^{p_1}} + \frac{1}{2^{p_1+p_2}} + \frac{1}{2^{p_1+p_2+p_3}} + \ldots$$



with well-determined integers p_1, p_2, p_3, ... > 0: the digits 1 in the “default”
binary description of x are those of rank p_1, p_1 + p_2, etc., the others
being zeros; if for example one writes 1/2 in the form 0.01111... (and not
0.1000...), one has p_1 = 2, p_n = 1 for all n > 1. Conversely, such a
sequence of integers defines a number between 0 and 1. In other words, there
exists a bijection between the interval I = ]0, 1] and the set S of sequences²⁶
(p_1, p_2, ...) of integers > 0. That said, let (x, y) be a pair of elements of I and
let (p_1, p_2, ...), (q_1, q_2, ...) be the elements of S corresponding to x and y;
we associate with the pair (x,y) the number z ∈ I that corresponds to the
sequence (p_1, q_1, p_2, q_2, ...) obtained by interlacing the sequences
corresponding to x and y; thus one constructs, as one sees immediately, a bijective map
of I × I into I (or, from the other point of view, of S × S into S), whence
Cantor’s result.
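
The interlacing, and the passage from a sequence of integers > 0 to a binary number, can be tried out on finite prefixes. This Python sketch is a finite illustration only: the bijection itself concerns infinite sequences, and the names interlace, unlace and partial_sum are ours.

```python
from fractions import Fraction
from itertools import accumulate

def interlace(ps, qs):
    """(p1, q1, p2, q2, ...) from (p1, p2, ...) and (q1, q2, ...)."""
    for p, q in zip(ps, qs):
        yield p
        yield q

def unlace(zs):
    """Recover the two sequences from their interlacing."""
    zs = list(zs)
    return zs[0::2], zs[1::2]

def partial_sum(gaps):
    """1/2^{p1} + 1/2^{p1+p2} + ... for a finite prefix (p1, p2, ...)."""
    return sum(Fraction(1, 2 ** rank) for rank in accumulate(gaps))

# 1/2 = 0.01111... corresponds to p1 = 2, pn = 1 for n > 1; four terms give 15/32.
assert partial_sum([2, 1, 1, 1]) == Fraction(15, 32)

ps, qs = [2, 1, 1], [1, 3, 1]
zs = list(interlace(ps, qs))
assert zs == [2, 1, 1, 3, 1, 1]
assert unlace(zs) == (ps, qs)        # nothing is lost: the map can be undone
```

That unlace recovers both sequences is the finite shadow of the injectivity of the map; surjectivity comes from the fact that every sequence of integers > 0 can be split in this way.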

One would be wrong to believe that he saw all this immediately; to start 
with, it took him years to surmount the psychological obstacle which the 
implausibility of the results²⁷ and the predictable reactions of the majority
of his contemporaries presented. The very simple proofs we have presented 
came later. 

Cantor, and others after him, believed for a long time that every infinite 
subset of R falls into one of the three categories which we have just defined, 
the finite, the countable and the power of the continuum; we know now that 
this “continuum hypothesis” is neither true nor false: one can neither deduce 

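26 If one denotes the set of integers > 0 by $X$ then $S$ is precisely the set of maps 
from $X$ into $Y = \mathbb{N} - \{0\}$. A map of $X$ into $Y$ is a subset of $X \times Y$, i.e. an 
element of $\mathcal{P}(X \times Y)$; the set $S$ of these maps thus satisfies 

$$S \in \mathcal{P}(\mathcal{P}(X \times Y)) \subset \mathcal{P}(\mathcal{P}(\mathcal{P}(\mathcal{P}(X \cup Y)))).$$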



27 Peano did much better than Cantor a little later: if one represents $I$ as an interval 
of a line and $I \times I$ as a square in the plane, Peano constructed a map of $I$ into 
$I \times I$ which is surjective and continuous. This amounts to the fact that a point 
moving in the plane in a continuous manner can, in a finite time, pass through 
ALL the points of a square. The Dutchman L. E. J. Brouwer later completed 
the statement: a map $f$ of $I$ into $I \times I$ can be continuous and surjective, but 
not continuous and bijective; the point has to pass through all the points of the 
square an infinite number of times. 
it from the axioms of set theory invented by the logicians nor, if one adopts 
it, deduce a contradiction. You are very unlikely to need this very difficult 
result; the overwhelming majority of mathematicians die before using it. 



In proving these results by methods which, elementary as they are, were 
totally unknown before him, Cantor showed that there are different sorts 
of infinity, which twenty-five centuries of philosophers and theologians had 
apparently never discovered. One can even always compare them. A famous 
theorem (Schröder, 1896 and Bernstein, 1898 - it already appears in Cantor, 
but his proof leaves much to be desired) says that if $X$ and $Y$ are two sets, 
then there exists an injection of $X$ into $Y$ (i.e. $X$ is equipotent to a subset 
of $Y$) or an injection of $Y$ into $X$ and that, if both these cases happen, then 
$X$ and $Y$ are equipotent. A convenient way of expressing this result is to 
attach to each set $X$ a symbol $\mathrm{Card}(X)$, the cardinal of $X$, agreeing that the 
relation $\mathrm{Card}(X) = \mathrm{Card}(Y)$ means that $X$ and $Y$ are equipotent and that 
the relation $\mathrm{Card}(X) < \mathrm{Card}(Y)$ means that $X$ is equipotent to a subset of 
$Y$, but not to $Y$ itself; the symbol $\mathrm{Card}(X)$ thus plays the role of the “number 
of elements” of $X$. The theorem of Schröder-Bernstein can then be expressed 
as saying that if 28 $\aleph_1$ and $\aleph_2$ are two cardinals, then one and only one of the 
three following cases occurs: 

$$\aleph_1 < \aleph_2, \qquad \aleph_1 = \aleph_2, \qquad \aleph_2 < \aleph_1.$$

It is easy to construct infinite sets whose cardinals are increasingly large. 
If $X$ is a set, the subsets of $X$ are, as we saw above, the elements of a new 
set $\mathcal{P}(X)$. One can always construct an injection $X \to \mathcal{P}(X)$, for example 
$x \mapsto \{x\}$, but $X$ and $\mathcal{P}(X)$ are never equipotent, an “obvious” result if $X$ 
is finite 29. To see this, consider a map $X \to \mathcal{P}(X)$; this associates to each 
$x \in X$ a set $M(x) \subset X$. Let $A \subset X$ be the set of $x \in X$ such that $x \notin M(x)$ 

28 The use of Hebrew letters to baptise the cardinals goes back to Cantor and has 
made many believe that he was a Jew; and with such a name . . . His father, 
a prosperous merchant and cosmopolitan, was protestant and his mother, née 
Marie Böhm, catholic and of a family of musicians. Their son was protestant, 
but his family link to Catholicism “may have made it easier for him to seek, 
later on, support for his philosophical ideas among Catholic thinkers” , his entry 
in the DSB tells us. In fact, Cantor’s father was indeed born to Jewish parents 
but converted before the birth (in Saint Petersburg) of the mathematician. The 
“Catholic thinkers” of the DSB were principally Jesuits, Fraenkel tells us in his 
biography of Cantor. Their interest in his ideas was not the greatest service they 
could render him . . . The Dictionary of Scientific Biography (DSB), Princeton 
UP, whose publication was directed by Charles Coulton Gillispie, an eminent 
specialist in the history of science in France of the XVIII th century and of the 
Revolution, comprises about twenty large format volumes in which one can find 
the essentials, even if the quality and the importance of articles, written by very 
many authors, varies greatly. 

29 If $X$ has $n$ elements, $\mathcal{P}(X)$ has $2^n$. A subset $Y$ of $X$ is essentially the same 
as associating to each $x \in X$ the number 1 if $x \in Y$, the number 0 if $x \notin Y$. 
There are therefore as many subsets of $X$ as ways of constructing a sequence of 
and suppose that there exists an $a \in X$ such that $A = M(a)$ [as would be the 
case if the map $x \mapsto M(x)$ of $X$ into $\mathcal{P}(X)$ were surjective]. If $a \in A$, one has 
$a \notin M(a) = A$ by the definition of $A$, absurd; if $a \notin A$, the relation $a \notin M(a)$ 
is false by the definition of $A$, whence $a \in M(a) = A$, a new contradiction. 
Thus there cannot be a map of $X$ into $\mathcal{P}(X)$ which is surjective, even less 
bijective, qed. This proof is due to Cantor, even as formulated here. 
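
For a finite set the argument can be checked exhaustively. The sketch below is an illustration only: the names `diagonal_set` and `all_subsets` are invented here, and the set $X = \{0, 1, 2\}$ is an arbitrary choice. It runs through all $8^3$ maps $M$ of $X$ into $\mathcal{P}(X)$ and verifies that the set $A$ of the proof is never a value of $M$.

```python
from itertools import product

def diagonal_set(X, M):
    """Cantor's set A = {x in X : x not in M(x)} for a map M given as a dict."""
    return frozenset(x for x in X if x not in M[x])

def all_subsets(X):
    """The elements of P(X), each encoded by a sequence of 0s and 1s."""
    X = sorted(X)
    return [frozenset(x for x, bit in zip(X, bits) if bit)
            for bits in product([0, 1], repeat=len(X))]

X = [0, 1, 2]
subsets = all_subsets(X)          # the 2^3 = 8 subsets of X

# Every map M: X -> P(X) is a choice of one subset M(x) for each x.
never_reached = all(
    diagonal_set(X, dict(zip(X, images))) not in images
    for images in product(subsets, repeat=len(X))
)
```

Such a check proves nothing for infinite sets, of course; its only merit is to show the set $A$ at work.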

The simplest illustration of the preceding result, though not very useful 
to us, is when $X = \mathbb{N}$: the set $\mathcal{P}(\mathbb{N})$ is equipotent to $\mathbb{R}$ or, what comes to the 
same 30, to the interval $X : 0 \le x \le 1$ in $\mathbb{R}$. One proves this by associating to 
each subset $A$ of $\mathbb{N}$ the number $x \in X$ whose $n$th binary digit is equal to 1 
if $n \in A$ and to 0 if not; one has to take note of the existence of numbers $x$ 
which have two different binary expansions (0.1000... = 0.0111...), but they 
form a countable set since they are of the form $p/2^n$ with $p$ and $n$ integers. 
The reader may provide the details as an exercise. 

However it may be, 

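$$\mathcal{P}(\mathbb{N}), \quad \mathcal{P}(\mathcal{P}(\mathbb{N})), \quad \mathcal{P}(\mathcal{P}(\mathcal{P}(\mathbb{N}))),$$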

etc., are sets whose “powers” become larger and larger, even though negligible 
in comparison with the analogous sets constructed starting from R. It is not 
advisable to plunge into contemplation of these vertiginous elaborations. One 
may also absolve oneself from reading the following n° which, for us, is only 
a gymnastic exercise in manipulating the symbols $\in$ and $\subset$; but if you read 
and understand all the proofs, you will be liberated for the rest of your days 
from any inferiority complex as regards the mysteries of the “transfinite” . . . 

However one should not believe that these weird creatures have no practi- 
cal use in mathematics; on the contrary, they are used to prove indispensable 
theorems in functional analysis (the Hahn-Banach Theorem to mention only 
one) , in general topology (every cartesian product of compact spaces is com- 
pact), in algebra (existence of bases in any vector space), etc. 

9 — Ordinals and cardinals 

The von Neumann sets used above in defining the whole numbers have very 
curious properties. For a start, if $X$ is such a set, then every element of $X$ 
is also a subset of $X$; for example, the element $\{\emptyset, \{\emptyset\}, \{\emptyset, \{\emptyset\}\}\}$ of the set 4 
has as its elements $\emptyset$, $\{\emptyset\}$ and $\{\emptyset, \{\emptyset\}\}$, which themselves belong to 4. One 
notes also that if $a$ and $b$ are two elements of $X$, then either $a \in b$, or $a = b$, 

$n$ numbers equal to 0 or 1, and since there are two possible choices for each of $n$ 
terms of such a sequence, one obtains $2 \times 2 \times \ldots \times 2$ possibilities (application: 
coin tossing). More generally, if $X$ has $n$ elements and if $Y$ has $p$, then the set 
of maps from $X$ into $Y$ has $p^n$ elements (same argument). 

30 Every school-leaver can tell you that $x \mapsto x/(1 + |x|)$ is a bijection of $\mathbb{R}$ onto 
the interval $I : -1 < x < 1$. Since the interval $J : -1 \le x \le 1$ differs from $I$ by 
only a countable (even finite) set, it is equipotent to $I$, thus to $\mathbb{R}$. It remains to 
find a bijection of $J$ onto the interval $0 \le x \le 1$. 
or $b \in a$ and that the relation $a \in b$ is equivalent to {$a \subset b$ and $a \neq b$}. If $a$, 
$b$, $c$ are three elements of $X$ and both $a \in b$ and $b \in c$, then $a \in c$. Finally 
one remarks that if $X$ and $Y$ are two integers defined à la von Neumann, 
the relation $X \le Y$, which everyone knows, becomes $X \subset Y$, showing that 
inequalities between integers are reducible to membership relations or set 
inclusions. One also sees that if $X$ and $Y$ are two von Neumann integers, one 
always has $X \subset Y$ or $Y \subset X$. 

Starting from these properties of von Neumann sets (or whole numbers), 
one can generalise by calling an ordinal (number or set) any set X possessing 
the following properties : 

(O 1) the relation $x \in X$ implies $x \subset X$; 

(O 2) for any $a, b \in X$, one has either $a \in b$, or $a = b$, or $b \in a$. 

The simplest ordinal is naturally $\emptyset$, but there are others, to start with the 
sets of von Neumann, who gave a slightly different definition of the ordinals, 
though it is equivalent to the one above. 
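
For finite sets, conditions (O 1) and (O 2) can be tested mechanically. In the following sketch, offered as an illustration only, sets are represented by Python `frozenset` objects, and the names `successor`, `von_neumann` and `is_ordinal` are invented here; the successor of $X$ is taken to be $X \cup \{X\}$, as in the construction of the whole numbers.

```python
def successor(x):
    """The set x ∪ {x}."""
    return x | frozenset([x])

def von_neumann(n):
    """The von Neumann integer n, built from the empty set by n successors."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x

def is_ordinal(X):
    """Check (O 1) and (O 2) for a finite set of finite sets."""
    o1 = all(x <= X for x in X)                      # every element is a subset
    o2 = all(a in b or a == b or b in a
             for a in X for b in X)                  # any two elements are comparable
    return o1 and o2

four = von_neumann(4)
two = von_neumann(2)
not_an_ordinal = frozenset([von_neumann(1)])         # the set {{∅}}
```

Here `two in four` and `two < four` (strict inclusion) both hold, in accordance with the equivalence of $a \in b$ and {$a \subset b$ and $a \neq b$}, whereas $\{\{\emptyset\}\}$ fails (O 1) because its element $\{\emptyset\}$ is not one of its subsets.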

The set $\mathbb{N}$ is also an ordinal. To check (O 1), one remarks that every $x \in \mathbb{N}$ 
is a von Neumann integer, from which all its elements are too, so belong to 
$\mathbb{N}$, whence $x \subset \mathbb{N}$. To check (O 2), one notes that any von Neumann integer 
is an element of all its successors and that if $b \neq a$ is not one of the successors 
of $a$, then $a$ is one of the successors of $b$ (one need only be able to read and 
count ...). 

These sets possess rather amusing properties; as in Euclidean geometry 
and even for the same reason - one constructs an entirely autonomous theory 
knowing hardly anything -, they are proved in a technically elementary way 
by applying the definitions. 

First note that (O 1) can be written in the form 

$$X \subset \mathcal{P}(X)$$

or in the form 

$$x \in\in X \Longrightarrow x \in X$$

since an element of an element of $X$ is also an element of a subset of $X$, and 
so of $X$. 

Furthermore, the three cases in (O 2) are pairwise exclusive; for $a \in b$ and 
$a = b$ would imply $a \in a$, while $a \in b$ and $b \in a$ is, like $a \in a$, forbidden by 
the providential axiom of regularity at the end of n° 2. 

(1) Every intersection of ordinals is an ordinal. Clear from the definition. 

(2) If $X$ is an ordinal, then $s(X) = X \cup \{X\}$ is an ordinal. Verification of 
(O 1): $x \in s(X)$ implies either $x \in X$, whence $x \subset X \subset s(X)$, or $x = X$ and 
again $x \subset s(X)$. Verification of (O 2): if $a, b \in s(X)$, one has either $a \in X$ 
and $b \in X$, whence $a \in b$ or $b = a$ or $b \in a$ since $X$ is an ordinal, or $a \in X$ 
and $b \in \{X\}$, whence $b = X$ and thus $a \in b$, or $a, b \in \{X\}$, whence $a = b$, 
qed. 
(3) $\emptyset \in X$ for every nonempty ordinal $X$. By the axiom of regularity, 
there exists an $a \in X$ such that $a \cap X = \emptyset$. But $a \subset X$ by (O 1). Thus 
$\emptyset = a \cap X = a$, qed. 

(4) For $a, b \in X$, the relation $a \in b$ implies $a \subset b$. If indeed $x \in a$, one 
has either $b \in x$, or $b = x$, or $x \in b$ by (O 2). In the first case, one would 
have $x \in a \in b \in x$, impossible (axiom of regularity). In the second case, one 
would have $a \in b$ and $b \in a$, impossible. In consequence, $x \in a$ implies $x \in b$, 
whence $a \subset b$, qed. 

(5) Let $A$ be a nonempty subset of $X$; let us show that there exists an 
$a \in A$ such that $a \subset x$ for all $x \in A$. In other words: as a set of subsets of 
$X$, every nonempty subset $A$ of $X$ possesses a least element. By the axiom of 
regularity, there exists an $a \in A$ such that $a \cap A = \emptyset$. For $x \in A$, the relation 
$x \in a$ would imply $x \in A \cap a = \emptyset$, absurd; since $x \in a$ is impossible, one thus 
has either $x = a$ or $a \in x$, whence $a \subset x$ in both cases, by (4), qed. 

(6) Every element $Y$ of an ordinal $X$ is itself an ordinal. By (4), $x \in Y$ 
implies $x \subset Y$, whence (O 1). If on the other hand $x, y \in Y$, one has also 
$x, y \in X$ since $Y \subset X$ by (O 1); since $X$ satisfies (O 2), a fortiori so does $Y$, 
qed. 

(7) Let $X$ and $Y$ be two ordinals such that $Y \subset X$ and $X \neq Y$; then 
$Y \in X$ and conversely. The converse is clear since it implies $Y \subset X$ by (4) 
and $Y \neq X$ since otherwise one would have $X \in X$. 

So suppose that $Y \subset X$ and $X - Y$ is nonempty; $X - Y$ possesses a least 
element $b$ by (5); we shall see that $b = Y$, which will prove that $Y \in X$ (and 
even that $Y \in X - Y$, in accordance with the axiom of regularity). 

First we shall show that $b \subset Y$. For all $x \in b$, one has either $x \in X - Y$ or 
$x \in Y$. Since $b$ is the smallest element of $X - Y$, the first eventuality would 
imply $b \subset x$; but since $b$ is an ordinal, by (6), and since $x \in b$ by hypothesis, 
one also has $x \subset b$; the relation $x \in X - Y$ would thus imply $x = b$, impossible 
since $x \in b$. So we see that $x \in b$ implies $x \in Y$, whence $b \subset Y$. 

Conversely, for all $y \in Y$, one has either $b \in y$, or $b = y$, or $y \in b$. $Y$ 
being an ordinal by (6), one has $y \subset Y$. If $b \in y$, one has $b \in Y$, absurd since 
$b \in X - Y$. If $b = y$, one again has $b \in Y$ since $y \in Y$ implies $y \subset Y$. The 
only possible case is thus the third, which shows that $Y \subset b$, qed. 

(8) Let $X$ and $Y$ be two ordinals; then either $X \subset Y$ or $Y \subset X$. Let 
$Z = X \cap Y$, which is an ordinal by (1). If the theorem were false, one would have 
$Z \subset X$ and $Z \neq X$, so $Z \in X$ by (7), and similarly $Z \in Y$, whence $Z \in X \cap Y$, 
i.e. $Z \in Z$, qed. 

(9) Let $X$ and $Y$ be two ordinals; one has either $X \in Y$, or $X = Y$, or 
$Y \in X$. If in fact $X \subset Y$ one has either $X = Y$, or $X \neq Y$ and so $X \in Y$ by 
(7); one finishes the proof with the help of (8). 

(10) Let $a$ be an element of an ordinal $X$; then either $s(a) \in X$ or 
$s(a) = X$. Since $s(a)$ is an ordinal by (2), it is enough, by (9), to exclude the 
possibility that $X \in s(a) = a \cup \{a\}$. But if such were the case, one would 
have either $X \in a$ and thus $a \in X \in a$, impossible, or $X = a$ and then $a \in a$, 
impossible, qed. 

The two possibilities $s(a) \in X$ and $s(a) = X$ can very well arise. The 
second is trivially satisfied if one puts $X = s(a)$ for an ordinal $a$, for example 
a von Neumann integer, or for $a = \mathbb{N}$. For $X = \mathbb{N}$, the first happens for any 
$a \in X$. 

One can finally prove, but it is much more difficult, that 

(11) Every set is equipotent to an ordinal. It can be shown that this statement 
is equivalent to the axiom of choice. 

These properties explain the word “ordinal”. For consider an ordinal $X$ 
and, for $x, y \in X$, let us write $x < y$ if $x \in y$. Then we have the following 
statements, which everyone knows in the case of $\mathbb{N}$: 

(i) $x < y$ and $y < z$ imply $x < z$; 

(ii) for any $x, y$, one and only one of the following relations is true: 

$$x < y, \qquad x = y, \qquad y < x;$$

(iii) in every nonempty subset $A$ of $X$, there exists an $a$ such that $a < x$ for 
all $x \in A$ other than $a$. 

(i) follows from (7) since $x < y$, i.e. $x \in y$, is equivalent to $x \subset y$ and 
$x \neq y$; (ii) and (iii) are the properties (9) and (5). 

That said, let us consider an arbitrary set $E$ and, thanks to (11), let us 
choose an ordinal $X$ and a bijection $f$ of $E$ onto $X$. Given two elements $x$ 
and $y$ of $E$, let us agree to write that 

$$x < y \Longleftrightarrow f(x) < f(y).$$

One thus obtains an order relation 31 on $E$, possessing the properties (i), 
(ii) and (iii): we describe this by saying that $E$ is a well ordered set. There 
does indeed exist such an order relation on the set $\mathbb{R}$ of real numbers; but 
it clearly cannot be the one that everyone knows - this does not satisfy 
(iii): the set of numbers > 0 does not possess a least element, this is the 

31 On a set $E$, an order relation is a relation $xRy$ between the elements of $E$ such 
that (a) one has $xRx$ for all $x \in E$, (b) $xRy$ and $yRz$ imply $xRz$, (c) $xRy$ and $yRx$ 
imply $x = y$; it is convenient to write $x \le y$, and to write $x < y$ when also $x \neq y$. 
Examples: the (nonstrict) inequalities between whole numbers, or between real 
numbers, the inclusion relation between subsets of a set. When the conditions (i) 
and (ii) above are satisfied, one speaks of a total order - clearly not so in the case 
of inclusion - and of a well ordering when (iii) is also satisfied. This last concept 
was invented by Cantor whose two fundamental articles are available, with a 
long and interesting historical introduction in somewhat outdated language by 
its British translator: Georg Cantor, Contributions to the Founding of the Theory 
of Transfinite Numbers (Open Court Publishing Cy, 1915, reprinted Dover, 1955) 
axiom of Archimedes of Chap. II. To “construct” such a well ordering one 
would have to establish a bijection $f$ between $\mathbb{R}$ and an ordinal $X$ having 
the power of the continuum. This amounts to writing the elements of $\mathbb{R}$ “one 
after the other” as one does traditionally for the whole numbers; one might 
first choose a sequence of elements of $\mathbb{R}$, then another sequence following 
the first, then a third following the second, etc., and since the union of a 
countable infinity of countable sets is again countable, one would not have 
exhausted $\mathbb{R}$ after having put these sequences “end to end”; one would have 
to continue transfinitely, as Cantor put it. The general theory shows that 
such an ordinal $X$ and such a bijection exist in the sense of the logicians, 
but since this result depends on the axiom of choice (and is equivalent to it), 
it is out of the question to exhibit either $X$ or $f$ by means of a reasonably 
explicit procedure. One might well think that if an order relation possessing 
the miraculous property (iii) were easy to see on $\mathbb{R}$, mathematicians would 
not have waited until the XXth century to discover it. No one will ever see it. 

According to von Neumann, 1923, an ordinal is, by definition, a well 
ordered set subject to a supplementary condition which, apparently, no one 
had thought of before him: 

(iv) every $x \in X$ is the set of $y \in X$ such that $y < x$. 

For an ordinal defined by (O 1) and (O 2), the property (iv) is trivial: 
since $y < x$ is equivalent to $y \in x$ this means that $x$ is the set of $y \in X$ such 
that $y \in x$. In von Neumann’s definition, which employs neither (O 1) nor 
(O 2), (iv) shows that every element of $X$ is, as a set 32, a subset of $X$ and 
thus that $y \in x$ is equivalent to $y < x$. Condition (O 1) follows from this, and 
(O 2) is none other than the condition (ii) above since $y \in x$ is equivalent to 
$y < x$. Von Neumann’s definition is therefore equivalent to the one we have 
given. 

The fact that the successor of an ordinal is again an ordinal shows that 
the sets 

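$$\mathbb{N}, \quad \mathbb{N} \cup \{\mathbb{N}\}, \quad \mathbb{N} \cup \{\mathbb{N}\} \cup \{\mathbb{N} \cup \{\mathbb{N}\}\}, \quad \text{etc.}$$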

are ordinals. On forming the union of the unending sequence formed by N 
and its successive successors, if one dares to put it so, one obtains a new 
ordinal to which one can again apply this process etc. Note in passing that 
all these sets are countable, so very modest, since there exist ordinals of all 
possible powers if one believes assertion (11). What distinguishes one from 
the other is not their cardinal, it is the order in which their elements are, 
at least virtually, written. To be precise, consider two equipotent ordinals $X$ 
and $Y$ and suppose that there exists a bijection $f$ from $X$ onto $Y$ such that 
the relation $x' < x''$ implies $f(x') < f(x'')$. Then $X = Y$ (not an obvious 
result). 

32 We again recall that, even if it is not obvious, every mathematical object is a 
set. 




Cantor confined himself to considering well ordered sets, i.e. sets endowed 
with an order relation satisfying (i), (ii) and (iii), and considered two such 
sets as equivalent when there exists a bijection of the first onto the second 
which preserves the order of the elements, which is much more strict than 
the relation of equipotence; he did not think of imposing the condition (iv), 
but systematically associated to every element x of a well ordered set X the 
set of $y \in X$ such that $y < x$; this is not very different, since one can show 
that, for any well ordered set $E$, there exist an ordinal $X$ and a bijection $f$ 
of $X$ onto $E$ which transforms the inequalities in $X$ into the inequalities in 
$E$; further, $X$ and $f$ are determined uniquely by the order relation on $E$. 

The condition (iv) of von Neumann thus provides a perfectly determined 
“standard” in each equivalence class of well ordered sets; his definition of the 
whole numbers similarly provides, in the “class” of sets of fourteen elements, a 
Standard-Set of Fourteen Elements, namely the number 14. The physicists do 
this every day when they compare their folding metres to the iridioplatinum 
Standard-Metre lodged at the National Bureau of Standards at Washington, 
D.C. 

One should not be surprised that these ideas, introduced by Cantor about 
1895 in a quasi-philosophical and very obscure style, did not, at that time, 
arouse unanimous enthusiasm among his colleagues; but some adopted them 
immediately and tried to clarify them and to put them on a solid basis; in 
1900, at the International Congress of Mathematicians, David Hilbert, who 
was, with Henri Poincaré, one of the two greatest mathematicians of the time, 
proposed to his colleagues his famous list of the most important and difficult 
problems of the age; “to prove the continuum hypothesis” was one; and one 
knows his opinion on set theory, the Paradise into which Cantor has enabled 
us to enter and which we shall never leave (I quote from memory). These 
eulogies did not prevent the unhappy Cantor from spending a large part of 
his last twenty years in psychiatric establishments. All his formalisable ideas 
have been adopted, but not much of the detail of his definitions and proofs 
has been retained, conceived as they were in an age when one still lacked a 
precise language, a convenient notation, and the strict logical discipline in- 
troduced by his . . . successive successors. 

The ordinals may be used to define the cardinals of n° 8 really as sets and 
not just as simple symbols. One cannot use the ordinals themselves since two 
different ordinals can well be equipotent, for example $\mathbb{N}$ and its successor. 
But in a given ordinal $X$ the set of ordinals $Y \subset X$ equipotent to $X$ has, by 
(5), a least element $X_0$ (a nonstandard notation; we remark in passing that 
the logicians use Greek letters $\alpha$, $\beta$, etc. to denote ordinals, to distinguish 
them from the Hebrew cardinals); $X_0$ may be $X$ itself, for example if $X = \mathbb{N}$. 
If $Y \neq X$ is an ordinal equipotent to $X$ and if one supposes for example that 
$Y \subset X$ in accordance with (8), one has $Y_0 \subset Y \subset X$ and thus $X_0 \subset Y_0$ by the 
definition of $X_0$. But then one has $X_0 \subset Y$ and since $X_0$ is equipotent to $X$, 
and so to $Y$, one has $Y_0 \subset X_0$ by the definition of $Y_0$. One finds finally that 
$X_0 = Y_0$ whenever $X$ and $Y$ are equipotent ordinals. If $E$ is any set whatever, 
and if one chooses an ordinal $X$ equipotent to $E$, the corresponding ordinal 
$X_0$ does not depend on the choice of $X$ and is equipotent to $E$. We can thus 
agree to define $\mathrm{Card}(E) = X_0$; this is the standard-set in the class of sets 
equipotent to $E$. Exercise. An ordinal $X$ is a cardinal if and only if $X \subset Y$ 
for all ordinals $Y$ equipotent to $X$. 

These results lie on the edge of the abyss: one step more and you fall into 
metaphysics, into mysticism or into contradictions, for example if you speak 
of “the set of ordinals”. Indeed, suppose that such a set existed, and give it 
the inevitable name: $\Omega$. Let us show that $\Omega$ is again an ordinal. Every $x \in \Omega$ 
and every $y \in x$ being an ordinal by (6) and so an element of $\Omega$, one sees that 
$x \in \Omega$ implies $x \subset \Omega$, whence (O 1). If now $a$ and $b$ are two distinct elements 
of $\Omega$, one has either $a \in b$, or $b \in a$ by (9), whence (O 2). To conclude, it 
remains to deduce that $\Omega \in \Omega$. Exercise. The union of a set of ordinals is an 
ordinal. 

We gave above the definition of finite sets according to Dedekind: the 
set $X$ is finite if every injection $X \to X$ is bijective. Many others were 
found after him, but it is not always easy to establish the equivalence of all 
these definitions, so we shall not attempt to do so. The Pole Alfred Tarski 
for example characterised the finite sets in 1924 as follows: every family $(X_i)$ 
of subsets of $X$ should possess a minimal element, i.e. one not containing 
any other $X_i$ (“proof”: choose an $X_i$ whose number of elements is minimal). 
$\mathbb{N}$ is not finite, for if one denotes by $X_n$ the set of integers $p \ge n$, one has 
$X_0 = \mathbb{N} \supset X_1 \supset X_2 \supset \ldots$ with strict inclusions; if $\mathbb{N}$ were finite, one would 
have $X_n = X_{n+1} = \ldots$ from a certain integer $n$ on. 
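
Both criteria can be observed on a small set. The sketch below is an illustration only, with invented names (`maps`, `is_injective`, `is_surjective`, `minimal_elements`) and an arbitrary three-element set: it enumerates all maps $X \to X$ to test Dedekind's condition, and picks out the minimal elements of a family of subsets in Tarski's sense.

```python
from itertools import product

def maps(X, Y):
    """All maps X -> Y, each given as a dict."""
    X = list(X)
    return [dict(zip(X, values)) for values in product(list(Y), repeat=len(X))]

def is_injective(f):
    return len(set(f.values())) == len(f)

def is_surjective(f, Y):
    return set(f.values()) == set(Y)

def minimal_elements(family):
    """Members of the family containing no other member as a strict subset."""
    return [A for A in family if not any(B < A for B in family)]

X = {0, 1, 2}
injections = [f for f in maps(X, X) if is_injective(f)]
dedekind_finite = all(is_surjective(f, X) for f in injections)

family = [frozenset({0, 1}), frozenset({1}), frozenset({0, 1, 2})]
smallest = minimal_elements(family)
```

For $\mathbb{N}$ no such enumeration is possible, but the map $n \mapsto n + 1$ is an injection which never takes the value 0, so Dedekind's condition fails, as does Tarski's for the family $(X_n)$ of the text.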

By reason of (11), this would also suffice to characterise the finite ordinals; 
for example: an ordinal $X$ is finite if, for any nonempty ordinal $Y \subset X$, 
including $Y = X$, there exists an ordinal $Z$ such that $Y = Z \cup \{Z\}$. The set 
$X = \mathbb{N}$ is not finite in this sense: the condition is satisfied for every $Y$ strictly 
contained in $X$ (since $Y$ is a von Neumann set), but not for $\mathbb{N}$ itself: if one 
had $\mathbb{N} = Z \cup \{Z\}$, one would have $Z \subset \mathbb{N}$ and $Z \neq \mathbb{N}$, so $Z$ would be an 
element of $\mathbb{N}$ by property (7), i.e. a von Neumann set, so $s(Z) \in \mathbb{N}$ also, and 
one would be led to the relation $\mathbb{N} \in \mathbb{N}$. 

Finally we remark that Cantor and his successors had a lot of fun defining 
more or less algebraic operations on the cardinals, analogous to those known 
for the whole numbers; one obtains them starting from the operations of 
set theory. For example, one defines $\mathrm{Card}(X) + \mathrm{Card}(Y) = \mathrm{Card}(X \cup Y)$ 
taking the precaution of assuming $X$ and $Y$ disjoint, 
$\mathrm{Card}(X)\,\mathrm{Card}(Y) = \mathrm{Card}(X \times Y)$, $\mathrm{Card}(X)^{\mathrm{Card}(Y)} = \mathrm{Card}(F)$ where $F$ is the set of maps from 
$Y$ into $X$, etc. The formula $\mathrm{Card}(X) + 1 = \mathrm{Card}(X)$ characterises the infinite 
sets and entrances the mystics, though not the financiers, they who will never 
believe that if one possesses an infinite fortune it serves no purpose to continue 
to augment it. 
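
On finite sets these definitions give back ordinary arithmetic, as a short computation shows. The sets `X` and `Y` below are arbitrary examples chosen for this sketch, and the power is computed by counting the maps from $Y$ into $X$, i.e. the ways of choosing one element of $X$ for each element of $Y$.

```python
from itertools import product

X = {"a", "b", "c"}          # Card(X) = 3
Y = {0, 1}                   # Card(Y) = 2, and X, Y are disjoint

card_sum = len(X | Y)                        # Card(X) + Card(Y) = Card(X ∪ Y)
card_product = len(set(product(X, Y)))       # Card(X) Card(Y) = Card(X × Y)

# A map from Y into X is a choice of one element of X for each element of Y.
maps_from_Y_into_X = list(product(sorted(X), repeat=len(Y)))
card_power = len(maps_from_Y_into_X)         # Card(X)^Card(Y)
```

Without the disjointness assumption the first formula fails: `len(X | X)` is 3, not 6.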
§2. The logic of logicians 33 

In the version of logic more or less universally adopted the basic material 
comprises the following elements: 

(i) expressions which the Greek philosophers and geometers already used 
in plain language and which one represents by the sign $\Longrightarrow$ (“implies”, “thus”, 
“it follows that”, “it results that”, “if ... then”, etc.), the sign $\vee$ which 
resembles the sign $\cup$ and which one writes “or” (the nonexclusive logical 
disjunction of two assertions), and finally the sign $\neg$, “not”, the negation of 
an assertion (“it is false that ...”); to these three fundamental signs one adds 
two other signs which are also useful but are only convenient abbreviations: 

$P \wedge Q$ signifies: not [(not $P$) or (not $Q$)], 

$P \Longleftrightarrow Q$ signifies: $(P \Longrightarrow Q) \wedge (Q \Longrightarrow P)$; 

we always write $P$ or $Q$ instead of $P \vee Q$, $P \,\&\, Q$ instead of $P \wedge Q$ and “not” 
instead of the sign $\neg$: there are already enough cabbalistic signs in Mathematics 
not to want to add to them; 

(ii) the quantifiers $\forall$ (“for all” or “whatever”) and $\exists$ (“there exists”, 
“there is at least one”), such as appear in the following statement: for any 
positive number $x$, there exists a number $y$ such that $x = y^2$ (generally 
there exist even two ...), which one writes 

$$(\forall x[(x \in \mathbb{R}) \,\&\, (x > 0)] \Longrightarrow \{\exists y[(y \in \mathbb{R}) \,\&\, (y^2 = x)]\});$$

by convention, 

33 My feeble attempts to find an exposition of logic, either in French or in English, at 
once accessible and readable, have not been fruitful, with one sole exception: René 
Cori and Daniel Lascar, Logique mathématique, transl. as Mathematical Logic: A 
Course with Exercises (OUP, 2001), which also presents the axiomatic theory of 
sets. Patrick Suppes, Axiomatic Set Theory (Van Nostrand, 1960, reprint Dover, 
1972), without comparing it to Cori and Lascar, has been of great use, but already 
assumes the reader to be familiar with the basic principles of logic and, indeed, 
with the naive theory and its usages. Paul R. Halmos, Naive Set Theory (Van 
Nostrand, 1960 or Springer, 1974) is very readable but ignores Logic completely, 
which is not necessarily an inconvenience to mathematicians. There are also the 
volumes on the Theory of Sets by N. Bourbaki; Cori and Lascar write that logic 
“is not the forte” of Bourbaki, which can be explained by the fact that the 
treatise was written by mathematicians for mathematicians. Since, before the 
war, only two people were interested in logic in France - Jacques Herbrand and 
the philosopher Jean Cavaillès - who both departed prematurely (the first in 
a mountaineering accident, resistance to German occupation took the second), 
one should at least ascribe the credit to Bourbaki for having spread the subject 
widely in France, even if the French Cardinals of logic have a much more elaborate 
conception of it, as is normal. As far as I am concerned, reading Bourbaki’s 
Fascicule de résultats on Set Theory as soon as it was published at the beginning 
of the 1940s allowed me to learn in a few weeks everything I ever needed in this 
respect. 
$(\exists x P)$ signifies not[$\forall x$(not $P$)]; 

(iii) variables $x, y, \ldots, X, Y, \ldots, a, b, \ldots$ etc., in unlimited quantity, symbolising 
totally indeterminate objects; 

(iv) logical punctuation signs such as (, ), [, ], etc., whose purpose is 
to group one or other of the sub-propositions of which a given proposition 
is composed. The professional logicians use these much less than we have 
done 34 , but we are not addressing them. 
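
That the two abbreviations of (i) and the convention of (ii) agree with everyday usage can be confirmed by a truth table. The following sketch is only an illustration: the helper `implies`, the finite `domain` and the sample properties are assumptions made here, and Python's `all` and `any` play the role of $\forall$ and $\exists$ over that finite domain.

```python
from itertools import product

def implies(p, q):
    """Truth value of 'p implies q'."""
    return (not p) or q

# Run through the four possible truth values of (P, Q).
abbreviations_agree = all(
    (P and Q) == (not ((not P) or (not Q)))
    and (P == Q) == (implies(P, Q) and implies(Q, P))
    for P, Q in product([False, True], repeat=2)
)

domain = range(-3, 4)

def exists(prop):
    return any(prop(x) for x in domain)

def not_forall_not(prop):
    return not all(not prop(x) for x in domain)

has_root_of_4 = lambda x: x * x == 4     # true for x = 2 and x = -2
has_root_of_5 = lambda x: x * x == 5     # true for no integer of the domain
```

Both `exists` and `not_forall_not` give the same answer for each property, true for the first and false for the second.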

In particular, the parentheses allow us to indicate the domains of application 
of quantifiers clearly. This last point leads to the fundamental distinction 
between free variables and bound variables: a variable $x$ is said to be bound 
if it appears in the domain of application of a quantifier $\forall x$ or $\exists x$, whose 
domain of application is, in principle, delimited by the parentheses ( and ); 
a variable is said to be free if it is not bound. Although the professional logicians 
do not hesitate to write statements in which the same letter $x$ is bound 
in certain parts of a proposition and free in others, for example 

$$(\forall x(x^2 \ge 0)) \,\&\, ((x > 1) \Longrightarrow (x^2 > x)),$$

one can only discourage the reader unfamiliar with logic from this usage; it 
is much more prudent to write the preceding proposition in the form 

$$[\forall x(x^2 \ge 0)] \,\&\, \{(y > 1) \Longrightarrow (y^2 > y)\}$$

to make clear the fact that it is composed of two unrelated assertions. 
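
The distinction lends itself to a mechanical computation. In the sketch below, which is an illustration and not a piece of formal logic, a proposition is encoded as a nested tuple; this encoding and the names `atom`, `free_vars` and `bound_vars` are assumptions made here. A variable is counted as free if at least one of its occurrences lies outside the domain of every quantifier bearing its name.

```python
def atom(label, *variables):
    """A primitive proposition, with the list of variables it mentions."""
    return ("atom", label, frozenset(variables))

def free_vars(f):
    if f[0] == "atom":
        return set(f[2])
    if f[0] in ("forall", "exists"):          # ("forall", "x", P)
        return free_vars(f[2]) - {f[1]}
    # "not", "or", "and", "implies": collect from the sub-propositions.
    return set().union(*(free_vars(g) for g in f[1:]))

def bound_vars(f):
    if f[0] == "atom":
        return set()
    if f[0] in ("forall", "exists"):
        return {f[1]} | bound_vars(f[2])
    return set().union(*(bound_vars(g) for g in f[1:]))

# The discouraged form: x is bound on the left and free on the right.
mixed = ("and",
         ("forall", "x", atom("x^2 >= 0", "x")),
         ("implies", atom("x > 1", "x"), atom("x^2 > x", "x")))

# The prudent form: the free variable has been renamed y.
separated = ("and",
             ("forall", "x", atom("x^2 >= 0", "x")),
             ("implies", atom("y > 1", "y"), atom("y^2 > y", "y")))
```

In `mixed` the letter `x` turns up in both `free_vars` and `bound_vars`, which is precisely the confusion the renaming removes.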

The use of these symbols 35 allows one to construct “propositions” or 
assertions; these are finite sequences of letters and of signs taken from the 
list above and a priori chosen arbitrarily, for example 

$$(\forall z((za)\ \text{or}\ ((\exists y)b) \Longrightarrow \text{not}(x(\exists y)x)$$

It goes without saying that to obtain meaningful sequences (not the case in 
the example above), one must observe a certain number of generally obvious 
rules of syntax. To be precise, the only syntactically correct propositions 36 
are those one can obtain by repeated application of the following rules or 
propositions, in which P and Q stand for propositions or assemblages of 
signs which are already known to be syntactically correct: 

34 The parentheses ( and ) are in fact enough, as anyone will know who, for example, 
has consulted a computerised catalogue in the library using key words. 

35 To respect strict logical formalism is not in my programme - anyway I would 
be incapable of it - apart from implication arrows and the signs “&” and “or” 
which will serve sometimes to eliminate every ambiguity in a statement. Nor 
is it to encourage the reader to avail himself of simple stenographic signs which 
permit him, as one has seen so often, to write gobbledegook instead of expressing 
himself plainly. 

36 This does not mean “true”. The relation 1 = 2 is syntactically correct. Likewise, 
the syllogism “every man is immortal, Socrates is a man, therefore Socrates is 
immortal” is perfectly correct. 
(a) not $P$, 

(b) $P$ or $Q$, 

(c) $P \Longrightarrow Q$, 

(d) $(\forall x(P))$ where $x$ appears in $P$ as a free variable. 



Some remarks need to be made on (d), the rules of formation (a), (b), 
(c) presenting no problem other than that of determining the “primitive” 
propositions starting from which all the others are to be formed with the aid 
of the four preceding schemata. If $P\{x\}$ is a proposition involving a variable 
$x$ representing an a priori indeterminate object, and maybe other variables, 
the expressions 

$$(\forall x P\{x\}), \qquad (\exists x P\{x\})$$

are to be read, the first “for all $x$, $P\{x\}$” or “one has $P\{x\}$ for all $x$”, the 
second “there exists $x$ such that $P\{x\}$” or “one has $P\{x\}$ for some $x$”, “an 
$x$” signifying “at least one $x$”. If $P$ involves other variables $y, z, \ldots$ than $x$, 
the assertion $(\forall x P\{x, y, z\})$ obtained by applying the quantifier $\forall x$ to $P$ is 
again an assertion involving the variables $y, z, \ldots$ but in which $x$ has become 
a bound variable. 

Since the free variables represent totally arbitrary objects, you can, in a
formula featuring the free variables y, z, . . ., replace them by other distinct
variables u, v, . . ., free or bound, apart from the y, z, . . . which appear in the
proposition: the assertions $[(x > y) \Rightarrow (x + 1 > y)]$ and
$[(x > z) \Rightarrow (x + 1 > z)]$ are logically equivalent; on the contrary,
the propositions $[(x > y) \Rightarrow (x + 1 > y)]$ and
$[(y > y) \Rightarrow (y + 1 > y)]$ clearly are not; it is similarly
obvious that $(\exists x(x \in y))$ is not equivalent to $(\exists x(x \in x))$.

We shall often say that a bound variable x is a phantom variable since one
can replace the letter x everywhere by a sign with no logical or mathematical
meaning, for example the signs $, %, □, etc., with which at a stroke one can
replace x in the quantifier ∀x. For example, it is clear that the proposition

$$[(x \in \mathbb{R}) \;\&\; (x^2 = 4) \;\&\; (y + 1 > y)]$$

really contains “arbitrary variables” x and y, but that in the assertion

$$(\exists x[(x \in \mathbb{R}) \;\&\; (x^2 = 4) \;\&\; (y + 1 > y)]),$$

the variable x does not play the same role as y; one could equally well write

$$(\exists \square[(\square \in \mathbb{R}) \;\&\; (\square^2 = 4) \;\&\; (y + 1 > y)])$$
as Bourbaki does. In the notation 

$$X = \bigcup_{i \in I} X_i$$

used to represent a union of sets, the letter i is, despite the absence of a visible
quantifier, a bound variable; the preceding definition is in fact an abbreviated
way of writing

$$(\forall x\{(x \in X) \Leftrightarrow (\exists i)[(i \in I) \;\&\; (x \in X_i)]\}).$$

You may therefore replace the letter i with the sign □. 

For the rest, if x were again a free variable in the relation
$(\forall x\, P\{x\})$, one could again apply a quantifier to it, which would
lead one to idiocies such as

$$(\forall x(\exists x[(x \in \mathbb{R}) \;\&\; (x^2 = 4)]))$$

or, in plain language, “for all x, there exists a real number x such that
$x^2 = 4$”; this is as if you were to say “for every man, there exists a man
called Socrates”; phrases like this have no more meaning in logic than in
ordinary language.
It is to avoid these mistakes that, in the rule (d), one insists that x be a 
free variable in P : one has no right to quantify the same variable twice in 
succession for the excellent reason that this “right” is not inscribed in the 
constitution of the Empire. 

The “true” propositions are thus those one obtains by repeated application
of the rules (a), ..., (d) starting from a small number of explicitly
formulated schemata of propositions considered true a priori. To obtain the
classical syllogisms, it suffices to assume a priori that the four following types
of relations are true:

$$(P \text{ or } P) \Rightarrow P, \qquad P \Rightarrow (P \text{ or } Q), \qquad (P \text{ or } Q) \Rightarrow (Q \text{ or } P),$$

$$(P \Rightarrow Q) \Rightarrow [(P \text{ or } R) \Rightarrow (Q \text{ or } R)],$$

where P, Q, R are any propositions. We can easily deduce other types of
valid relations from these, for example that if P, Q and R are propositions,
the proposition

$$[(P \Rightarrow Q) \;\&\; (Q \Rightarrow R)] \Rightarrow (P \Rightarrow R)$$

is true (even if the assertions $P \Rightarrow Q$, $Q \Rightarrow R$ are not),
similarly for the proposition

$$\text{not}\,(\text{not } P) \Leftrightarrow P$$

for any P.
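The schemata above can at least be checked semantically. The following is a minimal sketch, assuming the usual two-valued reading in which “P ⇒ Q” abbreviates “(not P) or Q”; the names `implies`, `SCHEMATA` and `is_tautology` are illustrative and not taken from the text. It runs through all truth values and confirms that each schema always evaluates to true. This is a truth-table verification, not the formal deduction from the four schemata that the text has in mind.

```python
from itertools import product


def implies(p, q):
    # "P => Q" is read as "(not P) or Q"
    return (not p) or q


# Each schema is a function of three truth values; unused ones are ignored.
SCHEMATA = {
    "(P or P) => P": lambda p, q, r: implies(p or p, p),
    "P => (P or Q)": lambda p, q, r: implies(p, p or q),
    "(P or Q) => (Q or P)": lambda p, q, r: implies(p or q, q or p),
    "(P => Q) => [(P or R) => (Q or R)]":
        lambda p, q, r: implies(implies(p, q), implies(p or r, q or r)),
    "[(P => Q) & (Q => R)] => (P => R)":
        lambda p, q, r: implies(implies(p, q) and implies(q, r), implies(p, r)),
    "not (not P) <=> P": lambda p, q, r: (not (not p)) == p,
}


def is_tautology(formula):
    """True when the formula holds for all eight assignments of P, Q, R."""
    return all(formula(p, q, r) for p, q, r in product([False, True], repeat=3))


for name, formula in SCHEMATA.items():
    print(name, "->", is_tautology(formula))
```

A formula such as “(P or Q) ⇒ P”, which is syntactically correct but not valid, fails the same check.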

As far as the quantifiers are concerned, the essential point is that if the
relation $(\forall x)P\{x, y, z\}$ is true and if A is a mathematical object,
then the relation $P\{A, y, z\}$, obtained by substituting the definition of A
for the letter x everywhere in P, is again true. Correlatively, if
$P\{A, y, z\}$ is true for a certain object A, then the relation
$(\exists x)P\{x, y, z\}$ is true.

To construct the objects - sets - and relations which are the study of
mathematics, and not only logic, one needs, as we have seen, the three extra
symbols $=$ [which the logicians tend to annex, subjecting it to the axiom
$(\forall x)(x = x)$], $\in$ and $\emptyset$, and a certain number of axioms,
those set out in the first part of this chapter; some permit the construction
of new sets - of unions, of pairs, of sets of sets, etc. - starting from given
sets and relations; others permit us to prove relations - equality,
membership, inclusion, etc. - between sets. As the use of logical and
mathematical signs leads as well to propositions as to sets, one needs to have
a definition of sets, say, for example:

$$A \text{ is a set} \Leftrightarrow \{(A = \emptyset) \text{ or } [\exists x(x \in A)]\}.$$

To expound all this in strictly formalised language as the logicians do would 
be pointless and unusable. Indeed, one learns the usage of set theory from 
practice, and this requires only a little acquaintance with the subject. 

It is now time to embark on true mathematics , where one has the agreeable 
illusion of manipulating other things than boxes filled with emptiness, or 
boxes filled with boxes filled with emptiness, or ... - of mathematics which 
would never have interested anybody if not for this illusion and its surprising 
appropriateness to “reality”.

§1. Convergent sequences and series - §2. Absolutely convergent
series - §3. First concepts of analytic functions



§1. Convergent sequences and series 

0 — Introduction: what is a real number? 

Throughout this book, we shall write N for the set of “natural” integers, i.e.
those $\geq 0$; Z for that of the “rational” integers, i.e. of any sign; Q for the set
of rational numbers (quotients of two integers); R for the set of real numbers;
and C for the set of complex numbers $x + iy$ with $x, y \in \mathbb{R}$ and
$i^2 = -1$: of course
$\mathbb{N} \subset \mathbb{Z} \subset \mathbb{Q} \subset \mathbb{R} \subset \mathbb{C}$.

Since we hope that the reader is relatively familiar with N, Z and Q, we 
shall emphasise the construction of real numbers, though without providing 
all the details, then briefly recall that of C, which is much simpler. 

The so-called “real” numbers - it is too late to change the terminology -
are truly not to be met in physical reality; they were born in the brains of
mathematicians. The event which precipitated this process was the discovery
by the Pythagoreans, in the Vth century before our era, of the fact that the
ratio $\tfrac{1}{2}(1 + \sqrt{5})$ between the diagonal and the side of a regular
pentagon - their emblem - and, later, the numbers $\sqrt{2}$, $\sqrt{3}$, etc.
are not rational; if for example one had $\sqrt{2} = p/q$ where p and q are two
integers and not both even (if not, simplify!), then the relation $p^2 = 2q^2$
shows that p is even, and so 4 divides the right hand side, whence q is even:
contradiction, absolute horror! For mathematicians.

The Greeks of this period, like their Babylonian, Indian or Egyptian pre- 
decessors, only knew of fractions: the successors of the Pythagoreans, until 
Euclid, were forced to develop a very abstruse theory of the (positive) real 
numbers, grounded in the “measure of magnitudes” : to say that the number 1 

1 The notation $\pi$ was introduced unsuccessfully in 1706 by an Englishman, and
independently by Euler in 1739; since everyone read him, the usage spread. For

$\pi$ is the ratio between the length of a circumference and that of the diameter,
presumes that we have a mathematically exact definition of “lengths”,
and not just a cadastral or physical one; which is far from obvious. Since, 
what is more, the modern algebraic notation had not then been developed, 
everything had to be explained geometrically. The mathematicians of the 
Renaissance and of the XVII th century inherited this point of view, too rig- 
orous for the age; the invention of the differential and integral calculus about 
1665-1700 made it disappear, banished it to a lower level, to the benefit of 
the prodigiously effective quasimechanical methods of Calculus, whose lack 
of rigour would certainly have scandalised the Greeks. One had to wait until 
the second half of the XIX th century to clarify and simplify, by reversing the 
procedure: the real numbers were then defined by procedures as rigorous as 
those of arithmetic, and one was then able to define precisely the length of a 
curve, the area of a surface, etc. In other words, geometry was no longer the 
foundation of analysis, but the other way round: even though, of course, the 
first still continues to provide indispensable intuitions. 

We spoke above of “so-called” real numbers; why so-called? One might 
insist that a magnitude to be measured, say the diagonal of a square, or 
a circumference, is an eminently real object. But it all arises, in the final 
analysis, from the desire to arrive at an absolute exactitude which certainly 
exists in the minds of mathematicians, if not in physical reality, where, even 
leaving aside the non-euclidean character of the universe, one never meets 
“points”, “lines”, “squares” or “circumferences” in the mathematical sense 
of the term. Apart perhaps from the whole numbers, mathematical objects, 
starting from the numbers we call real, are only, at best, idealised models 
of real objects. Add to this the essential fact that it is impossible to define 
an irrational number 2 without the intervention of infinitely many elementary 
arithmetical operations or of rational numbers, a situation which does not 
arise in “reality” or “Nature”, and even less, if this were possible, in the 
experimental sciences. 

True, physicists, engineers, etc. constantly use the number $\pi$, with only
a bare mention, not bothering to reflect on its exact mathematical meaning. 

the history and the construction of the rational numbers, real or complex, of
$\pi$ and other more advanced subjects, see H. D. Ebbinghaus et al., Numbers
(Springer, 1991), a book to be recommended from all points of view.

2 The real and complex numbers divide into two categories. First there are the
algebraic numbers which, by definition, satisfy algebraic equations with rational
coefficients: the rational numbers, and those obtained by extracting roots (for
example $i$, $\sqrt{2}$, $\sqrt{5}$), the roots of the equation
$x^{1848} - 3.14159\,x^{789} + 2.718 = 0$, etc. This set is a field included
between Q and C. The other numbers are called transcendental; $\pi$ is one such.
The set of algebraic numbers is countable, but not the set of transcendental
numbers, so not R. A simple procedure for constructing transcendental numbers
was discovered by Joseph Liouville in 1844: suppose that, for n large, the nth
decimal digit is $\neq 0$ if and only if there exists a $p \in \mathbb{N}$ such
that $n = 1 \cdot 2 \cdot 3 \cdots p = p!$. See for example Christian Houzel,
Analyse mathématique (Belin, 1996), p. 64.

But for them, in reality only the first four, ten or twenty-five decimals of $\pi$
matter, since their calculations are intended only to lead to experimentally
verifiable formulae. The value of mathematics to its users is to provide
systematic procedures for calculating the numbers they need to an arbitrarily
high accuracy. At the end of the XVIIth century, an English astronomer, John
Machin, used the formula

$$\frac{\pi}{4} = 4\arctan\frac{1}{5} - \arctan\frac{1}{239}$$

$$= 4\left(\frac{1}{5} - \frac{1}{3 \cdot 5^3} + \frac{1}{5 \cdot 5^5} - \frac{1}{7 \cdot 5^7} + \ldots\right) - \left(\frac{1}{239} - \frac{1}{3 \cdot 239^3} + \frac{1}{5 \cdot 239^5} - \ldots\right)$$

to calculate $\pi$ to 100 decimal places: here there are two sums with infinitely
many terms, i.e. two series; clearly one cannot calculate their sums exactly:
one takes only a finite, though sufficiently large, number of terms in these
sums. This formula, mathematically exact if one knows how to give a precise
meaning to an infinite sum, provides a systematic procedure for calculating
as many decimal places as one wishes of the number $\pi$. In other words, it
is the scheme of a numerical calculation pushed to infinity, a scheme which,
again, exists only in the minds of mathematicians and which the “users”
can always interrupt when it has provided the needed places of the complete
result. Those who complain of the “pedantry” of mathematicians have simply
not understood the problem. Without us, they would drag the letter $\pi$ along
in their formulae (or, better still, its 25 first decimals until they need the 150
following - after all, there exist tables of log to 100 decimals, probably not
calculated simply for pleasure) without knowing what it represents.
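To see what “interrupting the scheme” means in practice, here is a minimal sketch of Machin's formula in exact integer arithmetic. The function names `arctan_inverse` and `machin_pi` and the choice of ten guard digits are assumptions made for this illustration, not something the text prescribes. Each series is summed until its terms fall below the working precision, which is the finite truncation described above.

```python
def arctan_inverse(q, scale):
    """Integer approximation of scale * arctan(1/q), using the series
    1/q - 1/(3 q^3) + 1/(5 q^5) - ... cut off once the terms vanish."""
    total = 0
    power = scale // q          # scale / q^(2k+1), rounded down
    k = 0
    while power:
        term = power // (2 * k + 1)
        total += term if k % 2 == 0 else -term
        power //= q * q
        k += 1
    return total


def machin_pi(digits, guard=10):
    """The integer formed by '3' followed by the first `digits` decimals of pi.
    The guard digits absorb the rounding errors of the integer divisions."""
    scale = 10 ** (digits + guard)
    quarter_pi = 4 * arctan_inverse(5, scale) - arctan_inverse(239, scale)
    return (4 * quarter_pi) // 10 ** guard


print(machin_pi(30))
```

Asking for more digits only lengthens the two finite sums; nothing else in the procedure changes.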

At the present time the simplest way to define the real numbers might be
to say that they are “nonterminating decimal expansions”, like the number
$\pi = 3.14159\ldots$ This supposes that we know all the decimals of the number
$\pi$; a vast programme which some pursue, not in the hope of arriving at the
end, to be sure, but in the hope, probably also illusory, of showing that the
statistical distribution of the decimals of $\pi$ is not the result of a process
analogous to a random draw: if such were the case, one could deduce
mathematically demonstrable conjectures.

Apart from this particular case, which, after all, has never brought anybody
to a halt, what sort of mathematical object does a nonterminating
decimal expansion

$$x = x_0.x_1x_2\ldots$$

represent? So long as one has not defined the real numbers clearly and
precisely and proved their existence as mathematical objects, such an expansion
is only a sequence of numbers $x_0, x_1, \ldots$, where $x_0$ is any rational
integer, the “integer part” of the pseudo-number x, and the rest are integers
between 0 and 9, the “decimals” of x; as much to say a pattern on a strip of
paper of infinite length. One might also consider the preceding formula as a
condensed way of representing an increasing sequence of decimal numbers 3 ,
namely $x_0$; $x_0.x_1$; $x_0.x_1x_2$; ... and agree that, by definition, a real
number is such an expansion or such an increasing sequence. To say that it is
the limit of such an increasing sequence would be a perfect vicious circle: one
would first have to define what a limit is, which is precisely the first aim of
analysis (and of this chapter), and further to prove that it exists, which
assumes that all the problems have been solved. Some mediaeval theologians
“proved” the existence of God by observing that existence is one of the divine
qualities that figure in His definition. One would not get very far in
mathematics by such methods.

This definition of the real numbers brings up other tiresome problems. The 
first is that one has to explain why, for example, the expansions 1.0000. . . 
and 0.9999 . . . define the same number 1. The second, much more serious, is 
that one has to define the sum and the product of two real numbers. 

Everyone knows how to add or multiply decimal numbers that have a finite 
number of nonzero digits. For the sum, for example, one adds the decimals 
of the same rank, starting with the last digits to the right, and carrying to 
the next place when the sum exceeds 9. If one tries to apply this rule of 
commercial arithmetic to nonterminating expansions, one comes against a 
little obstacle: there is no last digit to the right. One might try to start from 
the left, but each new addition may have repercussions on all the preceding
digits. In short, it is a practical impossibility to define addition in this
way, and even less multiplication, or to prove the rules of calculation which
everyone expects - associativity, distributivity, etc. - without relying on an
“it is obvious that ...” or an “everyone knows that . . .”; but mathematics is
not based on rumours, no matter how ancient. This will not prevent us, in the
sequel, from sometimes using decimal expansions to explain - explain, and
not to prove - some theorem or proof, but one cannot deduce anything more
without ridiculous contortions - or without resorting to swindles as I myself
very deliberately did, to avoid complications, in my course in Algiers in 1964 4 .

One mathematically correct method - there are others - for defining the
positive real numbers (the negative numbers then come from the usual
algebraic procedures) consists of introducing particular sets in the set
$\mathbb{Q}_+$ of rational numbers $\geq 0$, the cuts: a nonempty subset X of
$\mathbb{Q}_+$ is a cut if it satisfies the following conditions 5 :

3 A decimal number is the quotient of an integer by a power of 10, so its decimal
expression has only a finite number of nonzero digits. 2/10 is a decimal number,
but 2/3 = 0.6666 . . . is not, even though it is rational.

4 Introduction à l'analyse mathématique (Union nationale des étudiants algériens,
1964). See also Calcul infinitésimal in the Encyclopaedia Universalis.

5 The condition (c) is automatically satisfied for every cut that defines an irrational
number, but if one omits it one finds that the number 1, for example, corresponds
to two different cuts: the set of $x \in \mathbb{Q}_+$ such that $x < 1$, and the
set of $x \in \mathbb{Q}_+$ such that $x \leq 1$. The definition adopted here
finesses the need to distinguish the

(a) it is bounded above, i.e. there exist numbers in Q greater than all the
$x \in X$,

(b) the relations $x \in X$ and $0 \leq y \leq x$ imply $y \in X$,

(c) for every nonzero element x of X there exists a $y \in X$ such that $x < y$.

Intuitively, and leaving aside the set X = {0} - which satisfies (a), (b) and
(c) -, a cut is the set of all the positive rational numbers which are strictly
less than a given (possibly rational) positive real number, in other words, the
set of all its rational positive approximations from below, to an unspecified
precision: the number $10^{-123}$ is a rational approximation from below to the
number $\pi$ as much as 3.14159 is.

The correct definition of the real numbers to which we have alluded then
consists of saying that a positive real number is purely and simply a cut;
in other words, that there is no difference in nature between a real number
and the set of all the rational numbers which are strictly less than it. This
definition suggests no mystical, metaphysical or physical interpretation of the
number $\pi$ for example, but it allows one to argue rationally about all the real
numbers, even the irrational ones. For a start, one can very simply define the
two fundamental algebraic operations and the inequality relation from this:

(1) the sum $X + Y$ of two cuts is the set of $z = x + y$ with $x \in X$ and $y \in Y$;

(2) the product $XY$ of two cuts is the set of $z = xy$ with $x \in X$, $y \in Y$;

(3) the inequality $X \leq Y$ means simply that $X \subset Y$.

A posteriori explanations: (1) and (2) mean that if a and b are two real
positive numbers, then every rational positive number $< a + b$ (resp. $ab$)
is the sum (resp. the product) of two rational positive numbers $< a$ and
$< b$ respectively. (3) means that if a and b are two real numbers, the weak
inequality $a \leq b$ is equivalent to the fact that every rational $x < a$ is
also $< b$ (strict inequalities).
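The definitions above lend themselves to a small experiment. The sketch below, whose names (`rational_cut`, `sqrt2_cut`, `larger_element`, `included`) are purely illustrative, represents a cut by its membership test on exact rationals. It assumes that the cut attached to $\sqrt{2}$ is the set of rationals $q \geq 0$ with $q^2 < 2$, and it can only spot-check inclusions on a finite grid of rationals, which illustrates the inequality (3) without proving it.

```python
from fractions import Fraction


def rational_cut(r):
    """C(r): the rationals q with 0 <= q < r, or the cut {0} when r == 0."""
    r = Fraction(r)

    def member(q):
        return q == 0 if r == 0 else 0 <= q < r

    return member


def sqrt2_cut(q):
    """Membership test for the cut of sqrt(2): rationals q >= 0 with q^2 < 2."""
    return q >= 0 and q * q < 2


def larger_element(q):
    """For q in the cut of sqrt(2), a strictly larger element of the same cut,
    as condition (c) requires: (2q + 2)/(q + 2)."""
    return (2 * q + 2) / (q + 2)


def grid(max_denominator=12, bound=3):
    """A finite sample of rationals in [0, bound], used for spot checks only."""
    return sorted({Fraction(n, d)
                   for d in range(1, max_denominator + 1)
                   for n in range(bound * d + 1)})


def included(X, Y, points):
    """Check X <= Y in the sense of (3), i.e. X contained in Y, on the sample."""
    return all(Y(q) for q in points if X(q))


POINTS = grid()
print(included(rational_cut(Fraction(7, 5)), sqrt2_cut, POINTS))   # 7/5 <= sqrt(2)
print(included(sqrt2_cut, rational_cut(Fraction(3, 2)), POINTS))   # sqrt(2) <= 3/2
```

The rational 17/12 lies in C(3/2) but not in the cut of $\sqrt{2}$, which is how the sample detects that the reverse inclusion fails.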

Naturally one has to verify that these definitions lead to sets satisfying
(a), (b) and (c), then that the “obvious” expected properties of addition,
of multiplication and of inequalities (see the following n°) are satisfied. One
must also show that every rational positive number x can be considered as
a real number: to do this one associates to it the set C(x) of
$y \in \mathbb{Q}_+$ such that $y < x$ if $x > 0$, or else the cut {0} if x = 0.
All this requires only the most elementary reasoning about the rational
numbers, patience, and sometimes a little ingenuity. This done, one can forget
this quite abstract construction for defining the real numbers and, as everyone
has always done, restrict oneself to reasoning from the fundamental properties
that we will state in n° 1. The only
interest in this construction is to prove the existence of a mathematical object

rational numbers from the irrational in the construction of R. The concept of a
cut, in Dedekind, has a slightly different meaning: for him, it is a partition of Q
into two nonempty sets X and Y such that $x < y$ for all $x \in X$ and all
$y \in Y$; there is then a unique real number “between” X and Y. Exercise: deduce
from (b) that $x < y$ if $x \in X$ and $y \notin X$, $y > 0$.







II - Convergence: Discrete variables 



- the set of real numbers, contained in P(Q), - possessing these properties 
(Chap. I, n° 4). 

This is a construction of a “modern mathematical” type, very much less 
intuitive than writing a nonterminating sequence of digits onto a strip of 
paper of infinite length and keeping on moving always further right along 
the line in the eternally frustrated hope of arriving one day at the aim of 
your quest, the Grail: the last digit to the right. It at least has the advantage 
of being mathematically correct, which could be the reason why Richard 
Dedekind (1831-1916), a through and through algebraist (theory of fields of 
algebraic numbers) who would not confuse mathematics and physics, invented 
it in 1858 and published it in 1872; one might consider this the birth date of 
modernity in mathematics, which consists of constructing all mathematical 
objects with the aid of logic and set theory and establishing their properties 
starting from there. 

This construction strips the real numbers, for example $\pi$, of all their
mystery: reasoning about $\pi$ becomes, in this perspective, reasoning about all
the rational numbers $< \pi$. All the calculators, and in particular the physicists
and engineers, have always done this, since they replace the nonterminating
decimal expansion of $\pi$ by, for example, its 25 first digits. But the genius
of Dedekind’s idea was to see that instead of privileging one or other more
or less arbitrary or artificial procedure of approximating a real number by
rational numbers, it was simpler to identify it with the totality of its rational
approximations from below, i.e. to a subset of the set Q of rational numbers,
and to use this to define the algebraic operations and inequalities in R.

For the reader’s convenience we recall briefly the definition of complex 
numbers which we shall use constantly, a much easier enterprise than defining 
the real numbers. 

It is often believed that they were invented to provide roots for the
quadratic equations $ax^2 + bx + c = 0$ when $b^2 - 4ac < 0$. This is not so:
the Italians of the XVIth century invented them because, having found
miraculous formulae for solving third degree equations, they discovered that
these formulae, although sometimes featuring square roots of negative numbers
- thus apparently “impossible” - nevertheless provided a real root 6 when one
substituted the formula in the equation, calculating à la Bertrand Russell,
i.e. not knowing of what one speaks. They were thus led to introduce new
“numbers” of the form $a + b\sqrt{-1}$, where a and b are usual numbers, and to
calculate mechanically with them, bearing in mind the “fact” that the square
of $\sqrt{-1}$ is equal to $-1$. Later, Euler introduced the convention of denoting
this strange number by the letter i; it took a long time before everyone, and
particularly the “users”, adopted it.
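A concrete instance may help. The text does not name one, so the following sketch takes Bombelli's classical equation $x^3 = 15x + 4$ as an assumed example and applies the formula $x = \sqrt[3]{q/2 + \sqrt{(q/2)^2 - (p/3)^3}} + \sqrt[3]{q/2 - \sqrt{(q/2)^2 - (p/3)^3}}$ for $x^3 = px + q$. The inner square root is that of $-121$, yet the sum of the two cube roots is the real root 4. The function name `cardano_root` is illustrative, and the code relies on Python's principal complex roots, which pair up correctly in this case but not for every choice of p and q.

```python
def cardano_root(p, q):
    """One root of x^3 = p*x + q from the sixteenth-century formula,
    evaluated with complex numbers (principal square and cube roots)."""
    half_q = q / 2
    radicand = complex(half_q ** 2 - (p / 3) ** 3)  # -121 for p = 15, q = 4
    s = radicand ** 0.5                              # about 11i
    u = (half_q + s) ** (1 / 3)                      # about 2 + i
    v = (half_q - s) ** (1 / 3)                      # about 2 - i
    return u + v


root = cardano_root(15, 4)
print(root)  # numerically very close to 4, with negligible imaginary part
```

The intermediate quantities are “impossible” numbers, but they cancel in the sum, which is what the calculators of the time observed.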

6 An equation of odd degree with real coefficients always possesses at least one
real root.

Of course one could, as some authors do for R, state axiomatically the
existence of a set C in which are given a particular element denoted by i and
two operations, addition and multiplication, satisfying certain conditions:

(C 1): C is a field, in other words the rules (I.1) to (I.9) of algebraic
calculation of the following n° are valid;

(C 2): R is a subset of C, and the algebraic operations defined in C coincide,
in R, with those already known (in other words, R is a “subfield” of C);

(C 3): $i^2 = -1$;

(C 4): every $z \in \mathbb{C}$ can be written as $z = x + iy$ with $x, y \in \mathbb{R}$.

The rule (C 4) is reasonable since the others show that the sum, the product
and the quotient of two complex numbers of the form x + iy are again of the
same type. Note also that the expression z = x + iy is necessarily unique,
since, if this were not the case, one could find, by subtraction, a relation of
the form $a + ib = 0$ with a and b real and not both zero, whence it would
follow either that $b = 0$ and so $a = 0$, absurd, or else that $i = -ab^{-1}$ is
a real number with square equal to $-1$, also absurd.

All this has for a long time satisfied the mathematicians and a fortiori the 
users, but does not explain whence - Heavens? - this mysterious “number” 
i comes. Young students have been told hundreds of times over the years that
no such number exists, but since the teacher and the textbook say it does, 
why should they try to understand? 

Instead of defining complex numbers as if mathematics had made no 
progress whatsoever during the last 450 years, it is much better, and very 
easy, to demystify the situation. Now, it has been the usage since the begin- 
ning of the XIX th century and Gauss to represent a complex number x + iy 
geometrically by the point of the plane with rectangular coordinates x and y ; 
the point which one always identifies with the ordered pair (x,y) of real 
numbers. Even if, for supposedly pedagogical reasons, you define complex 
numbers as expressions of the form x + iy, you always end up representing 
them by ordered pairs of real numbers. Why not, then, so define them to 
begin with? Complex numbers then become mathematical objects obtained 
from real numbers by a perfectly standard set-theoretic operation. 

This method, invented 7 in 1835 by William Rowan Hamilton, though not 
widely appreciated for a century and still rarely used in elementary textbooks, 
therefore is to declare that a complex number is, by definition, an ordered pair 
(a, b) of ordinary real numbers, to define equality and the two fundamental 
operations on such pairs by the formulae 

(0.1) $(a, b) = (c, d) \Leftrightarrow a = c \;\&\; b = d,$

(0.2) $(a, b) + (c, d) = (a + c, b + d),$

(0.3) $(a, b) \cdot (c, d) = (ac - bd, ad + bc),$

and to prove that conditions (C 1) to (C 4) above are satisfied 8 . This requires
only the simplest algebraic calculations and, for beginners, may even be a
good exercise in elementary algebra.

7 For the history of complex numbers, see the chapter by R. Remmert in H. D.
Ebbinghaus et al., Numbers (Springer, 1991). Hamilton’s method is the simplest
case of vastly more general constructions.

Suppose for instance we want to check the associativity of multiplication
(I.5), which in C is the identity

$$[(a, b)(a', b')](a'', b'') = (a, b)[(a', b')(a'', b'')].$$

Applying (0.3) blindly, this first reduces to

$$(aa' - bb', ab' + ba')(a'', b'') = (a, b)(a'a'' - b'b'', a'b'' + b'a''),$$

then, again by (0.3), to

$$((aa' - bb')a'' - (ab' + ba')b'', (aa' - bb')b'' + (ab' + ba')a'') =$$
$$= (a(a'a'' - b'b'') - b(a'b'' + b'a''), a(a'b'' + b'a'') + b(a'a'' - b'b'')).$$

Applying rule (I.9) of n° 1 for real numbers, we are reduced to proving that

$$((aa')a'' - (bb')a'' - (ab')b'' - (ba')b'', (aa')b'' - (bb')b'' + (ab')a'' + (ba')a'') =$$
$$= (a(a'a'') - a(b'b'') - b(a'b'') - b(b'a''), a(a'b'') + a(b'a'') + b(a'a'') - b(b'b'')),$$

and the result then follows from the associativity rules (I.5) for real numbers.
Rules (I.1), (I.2), (I.6) and (I.9) for complex numbers are proved in similar
ways.

The existence of complex numbers “zero” and “one” is clear: they are the
pairs (0, 0) and (1, 0). The “opposite” of (a, b) is obviously $(-a, -b)$. To prove
(I.8), i.e. that we can solve $(a, b)(x, y) = (1, 0)$ for any pair
$(a, b) \neq (0, 0)$, we write this as

(0.4) $ax - by = 1, \quad ay + bx = 0;$

if $b = 0$, in which case $a \neq 0$, the pair (1/a, 0) is a solution by (I.8) for R.
If $b \neq 0$, the second relation is equivalent to $x = -ay/b$, hence the first to

$$(a^2 + b^2)y = -b,$$

and in R this can be solved in one and only one way provided that
$a^2 + b^2 \neq 0$. But since $b \neq 0$, we may write $a = bc$ for some
$c \in \mathbb{R}$, and the relation $a^2 + b^2 = 0$ then is equivalent to

$$c^2 + 1 = 0;$$

it is thus because this equation has no solution in R that every non-zero
complex number has an inverse in C.

8 These are easy to understand when one sees that the pair (a, b) must after all
represent the sought-for symbol a + ib:

$$(a + ib) + (c + id) = a + c + i(b + d),$$

$$(a + ib)(c + id) = ac + ibc + aid + i^2 bd = ac - bd + i(ad + bc).$$

To conclude the construction, we observe that the mapping $x \mapsto (x, 0)$ of
R into C is injective and transforms addition and multiplication in R into the
corresponding operations in C. We may therefore agree to identify each real
number x with the pair (x, 0). Since we have

$$(0, 1)(y, 0) = (0, y)$$

by rule (0.3), rule (0.2) then proves that

$$(x, y) = (x, 0) + (0, 1)(y, 0) = x + iy$$

where we define

$$i = (0, 1).$$

Again by rule (0.3), we then have

$$(0, 1)(0, 1) = (-1, 0),$$

i.e. $i^2 = -1$. No mysteries anymore.
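Hamilton's construction is short enough to be tried out directly. The sketch below is one possible transcription, with illustrative names (`add`, `mul`, `inverse`, `I`, `ONE`, `SAMPLES`) and with exact rationals standing in for real numbers, which is an assumption of the example. It implements (0.2) and (0.3) on ordered pairs, takes the inverse from the solution of (0.4), and checks a few of the field rules on a small sample, which illustrates them without proving them.

```python
from fractions import Fraction
from itertools import product


def add(z, w):
    """Rule (0.2): (a, b) + (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)


def mul(z, w):
    """Rule (0.3): (a, b)(c, d) = (ac - bd, ad + bc)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


def inverse(z):
    """Solution (x, y) of (0.4) for (a, b) != (0, 0):
    x = a/(a^2 + b^2), y = -b/(a^2 + b^2)."""
    a, b = z
    n = a * a + b * b
    return (a / n, -b / n)


I = (Fraction(0), Fraction(1))
ONE = (Fraction(1), Fraction(0))
SAMPLES = [(Fraction(a), Fraction(b)) for a, b in product([-2, 0, 1, 3], repeat=2)]

print(mul(I, I) == (-1, 0))  # the pair (0, 1) squares to (-1, 0)
print(all(mul(mul(u, v), w) == mul(u, mul(v, w))
          for u, v, w in product(SAMPLES, repeat=3)))  # associativity on the sample
```

Nothing in the code refers to a “number i”; the pair (0, 1) simply behaves as one.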

Having done these constructions and verifications, you may forget them, 
and start computing mechanically as Euler was doing in 1750. 

The geometric representation of complex numbers x + iy by points (x, y)
(or by vectors of origin O) of a plane allows one to introduce the modulus or
the absolute value

$$|x + iy| = \sqrt{x^2 + y^2}$$

of a complex number. On introducing the conjugate $\bar{z} = x - iy$ of
$z = x + iy$, one finds immediately that

$$z\bar{z} = |z|^2,$$

whence $|z'z''| = |z'||z''|$ and $1/z = \bar{z}/|z|^2$. These summary indications -
plus, of course, the habit of computing with complex numbers, which is acquired
by practice - will suffice for nearly all our needs.

To conclude, let us state the wonderful property of complex numbers that 
explains their importance everywhere in mathematics: any algebraic equation, 
of any degree, with complex coefficients, has complex roots. This cannot be 
proved by purely algebraic methods; analysis is needed, as will be seen in 
Chap. VII, n° 18. 



1 — Algebraic operations and the order relation: axioms of R 

For those who have faith, one can consider the more fundamental properties
of the real numbers as axioms to be accepted without worrying about the
“nature”, concrete or metaphysical, of the objects about which one is
reasoning; an eminently “modern” approach: you have probably met it already in
Euclid’s geometry.

We can class these axioms in four groups.

First of all there is a group of purely algebraic formulae concerning the
two fundamental operations; they apply to the rational numbers, to the real
numbers and to the complex numbers and state that, endowed with these two
operations, R or C is, like Q, a commutative field (as one says in algebra):

(I.1) $x + (y + z) = (x + y) + z$ for all x, y, z;

(I.2) $x + y = y + x$ for all x, y;

(I.3) there exists an element 0 such that $0 + x = x$ for all x;

(I.4) given x there exists a y such that $x + y = 0$;

(I.5) $x(yz) = (xy)z$ for all x, y, z;

(I.6) $xy = yx$ for all x, y;

(I.7) there exists an element $1 \neq 0$ such that $1 \cdot x = x$ for all x;

(I.8) for each $x \neq 0$ there exists a y such that $xy = 1$;

(I.9) $x(y + z) = xy + xz$ for all x, y, z.

It is not the role of an exposition of analysis to develop the consequences
of these axioms, but among the “remarkable identities” of algebra there is
one which we shall use often, namely the binomial formula, which generalises
the relation $(x + y)^2 = x^2 + 2xy + y^2$:

(1.1) $(x + y)^n = x^n + n x^{n-1} y/1! + n(n-1) x^{n-2} y^2/2! + n(n-1)(n-2) x^{n-3} y^3/3! + \ldots + y^n$

(1.2) $(x + y)^n = \sum_{p=0}^{n} \binom{n}{p} x^{n-p} y^p$

where we have put $0! = 1$, $p! = 1 \cdot 2 \cdots p$ and

$$\binom{s}{p} = s(s-1)\ldots(s-p+1)/p! \quad \text{for } s \in \mathbb{C},\ p \in \mathbb{N},$$

whence

$$\binom{s}{p} = \begin{cases} s!/(s-p)!\,p! & \text{if } 0 \leq p \leq s \\ 0 & \text{if } p > s \end{cases} \quad \text{for } s \in \mathbb{N}.$$



The very simple proof comes from multiplying the right hand side of the 
formula corresponding to the exponent n by x+y and checking that the result 
is the formula corresponding to the exponent n + 1 (proof by induction). 

The binomial formula can be written in the form 

(1.4) $(x + y)^n/n! = \sum_{p+q=n} x^p y^q/p!\,q!$ 

where the $\sum$ is taken over all the ordered pairs of integers $p, q \in \mathbb{N}$ such that 
$p + q = n$. If one writes generally 

(1.5) $x^{[n]} = x^n/n!$ 

(divided powers), one then has 

(1.6) $(x + y)^{[n]} = \sum_{p+q=n} x^{[p]} y^{[q]}$. 

In this form, the relation extends to a sum of any number of terms; for 
example, 

(1.7) $(x + y + z + u)^{[n]} = \sum_{p+q+r+s=n} x^{[p]} y^{[q]} z^{[r]} u^{[s]}$ 

where, here again, the $\sum$ means that one must give the letters $p, q, r, s$ all 
positive integer values such that $p + q + r + s = n$ and then calculate the 
sum of all the corresponding expressions $x^{[p]} y^{[q]} z^{[r]} u^{[s]}$. In a relation such as 
(6), the letter $n$ denotes a fully determined integer, while the letters $p, \ldots, s$ 
denote bound variables, or phantoms, whose only function is to serve as a 
logical link between the symbol $\sum$ and the monomial $x^{[p]} y^{[q]} z^{[r]} u^{[s]}$. One can 
of course denote them by letters other than $p, \ldots, s$ so long as one does not 
use the letter $n$, which has a totally different sense; an expression such as 

$\sum_{n=1}^{n}$ 

is meaningless. 
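
The extension of the divided-power formula to several terms can be tested numerically. The sketch below is only an illustration under our own assumptions: rational sample values, exact `Fraction` arithmetic, and helper names (`divided_power`, `sum_of_monomials`) that do not come from the text. The exponents range over all integers $\ge 0$ with the prescribed sum.

```python
from fractions import Fraction
from itertools import product
from math import factorial

def divided_power(x, n):
    """x^[n] = x^n / n!"""
    return Fraction(x) ** n / factorial(n)

def sum_of_monomials(values, n):
    """Sum of x1^[p1] ... xk^[pk] over all exponents p_i >= 0 with p1 + ... + pk = n."""
    total = Fraction(0)
    for exps in product(range(n + 1), repeat=len(values)):
        if sum(exps) == n:
            term = Fraction(1)
            for v, p in zip(values, exps):
                term *= divided_power(v, p)
            total += term
    return total

# Four terms x, y, z, u as in (1.7); the sample values are arbitrary.
values = [Fraction(1, 2), Fraction(-3), Fraction(2, 5), Fraction(7)]
for n in range(5):
    assert sum_of_monomials(values, n) == divided_power(sum(values), n)
```

Note that the loop variables `exps`, `v`, `p` are precisely the "phantoms" of the text: renaming them changes nothing, whereas reusing the name `n` for one of them would destroy the meaning.
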

A second group of formulae involves the order relation $x \le y$ in R or Q: 

(II.1) the relations $x \le y$ and $y \le z$ imply $x \le z$; 

(II.2) the relation $\{(x \le y)\ \&\ (y \le x)\}$ is equivalent to $x = y$; 

(II.3) for all $x$ and $y$, one has $x \le y$ or $y \le x$; 

(II.4) the relation $x \le y$ implies $x + z \le y + z$ for all $z$; 

(II.5) the relations $0 \le x$ and $0 \le y$ imply $0 \le xy$. 

Next comes Archimedes’ axiom: 

(III) given $x, y > 0$ there exists an $n \in \mathbb{N}$ such that $y < nx$. 

As we said above, one could, in the preceding, replace R by the set Q of 
rational numbers. The fourth axiom, not true for Q, characterises the real 
numbers; in this form or in equivalent forms, it is indispensable to proving 
that there is something nontrivial to analysis. One can state it in several 
equivalent ways, for example: 

(IV) Let E be a nonempty set of real numbers. Suppose that there exist 
numbers $M \in \mathbb{R}$ such that $x \le M$ for all $x \in E$. Then the set of these 
numbers possesses a least element. 

This least of all the majorants or upper bounds M of E is called the least 
upper bound of E. 

If for example E is the set of truncated decimal expansions for $\pi$, this 
number will be $\pi$ itself. The necessity of axiom (IV) will become clear in n° 9 
if it is not yet so at this stage. 

If one defines the real numbers by means of cuts as explained at the end 
of n° 0, then axiom (IV) becomes a theorem, like the others, even almost 
trivial. For let E be a set of real numbers, positive (for simplicity), and 
bounded above. By definition, every $x \in E$ is a subset of $\mathbb{Q}_+$ satisfying the 
conditions (a), (b) and (c) of n° 0, the relation $x \le y$ being, by definition, 
equivalent to the inclusion relation $x \subset y$. The least upper bound of E is then 
the union $u$ in $\mathbb{Q}_+$ of all the sets $x \in E$. It is indeed obvious that the set $u$ 
satisfies the conditions (a), (b), (c) stated at the end of n° 0, that $u \ge x$ (i.e. 
$u \supset x$) for all $x \in E$, and that $y \ge u$ (i.e. $y \supset u$) for all majorants $y$ of E, i.e. 
for every cut such that $y \supset x$ for all $x \in E$; this is even almost the definition 
of a union of sets: the smallest set which contains them all. 



2 — Inequalities and intervals 

The handling of inequalities is absolutely fundamental in analysis, since they 
govern the approximation calculations in constant use. We shall not prove 
them in detail: everyone knows them, and sceptics, if any there are, may 
find their proofs in the first volume of Dieudonné’s Treatise on Analysis 
(Academic Press, 1960) for example. 

To avoid confusion with the Anglo-Saxon textbooks, as one calls them 
in the Quai d’Orsay, let us be very careful to make clear that for us the 
relation $x \ge 0$ means that x is positive (in the wide sense), while the relation 
$x > 0$ means that x is strictly positive. The anglophones say non-negative 
and positive, the Germans say positiv for $x > 0$ and negativ for $x < 0$, 
the number 0 being, for them, neither the one nor the other. Despite the 
hymns to “the French exception” , the French will probably end up imitating 
the Americans, since roughly 60% of global mathematical production is in 
English, and America provides about 40% of the total. But as a contemporary 
sociologist has remarked, one cannot change society by decree. 

The first essential point is the triangle inequality 

(2.1) $|x + y| \le |x| + |y|$, 

valid for all complex x and y, and obvious geometrically (one can also prove 
it . . .). One can generalise it to 

(2.2) $|x_1 + \ldots + x_n| \le |x_1| + \ldots + |x_n|$ 

and deduce that 

(2.3) $|(x_1 + \ldots + x_n) - (y_1 + \ldots + y_n)| \le |x_1 - y_1| + \ldots + |x_n - y_n|$ 

for all complex $x_i$ and $y_i$: if you want to calculate a sum of 100 numbers to 
within 0.1, it is prudent to calculate each term to within 0.001. 
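
The advice about the sum of 100 numbers can be tried out directly. The following sketch is only an illustration: the data are random rationals of our own choosing (with a fixed seed so that the run is reproducible), each "approximate" term differing from the exact one by at most 0.001.

```python
import random
from fractions import Fraction

random.seed(0)  # arbitrary seed, only to make the run reproducible
n, eps = 100, Fraction(1, 1000)

exact = [Fraction(random.randint(-10**6, 10**6), 10**5) for _ in range(n)]
# Each approximate term is off by at most eps = 0.001.
approx = [x + Fraction(random.randint(-1000, 1000), 10**6) for x in exact]

total_error = abs(sum(approx) - sum(exact))
bound = sum(abs(a - x) for a, x in zip(approx, exact))  # right hand side of (2.3)

assert total_error <= bound       # inequality (2.3)
assert bound <= n * eps           # 100 terms, each within 0.001: at most 0.1
```

In practice the errors partly cancel and `total_error` is usually far below 0.1, but (2.3) is the only guarantee one has without further information.
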

If one defines the distance between two complex (or real!) numbers x and 
y by the formula 

$d(x, y) = |x - y|$ 

whose geometric origin is quite clear, the triangle inequality amounts to 
saying that 

(2.1’) $d(x, y) \le d(x, z) + d(z, y)$ 

for all $x, y, z \in \mathbb{C}$. We shall often use the notation $d(a, b)$ to prepare the reader 
for the extensions of analysis to functions of several variables, i.e. defined on 
a subset of a real vector space of finite dimension such as $\mathbb{R}^p$, or to much 
more general spaces (Appendix to Chap. III). 

Archimedes’ axiom might appear obvious: one can, like John D. Rock- 
efeller, amass a billion 1910 dollars, at five grams of gold to the dollar, by 
saving one dollar a day for a sufficiently long time. But this does not follow 
from (I) and (II): the professional mathematicians, in general more compe- 
tent than the amateurs as in all the sports, have long since invented strange 
“totally ordered nonarchimedean fields” which satisfy (I) and (II) but not 
(III). One does not meet them in our usual mathematics. 

One of the consequences of (III) is that if a real number $x$ satisfies $x \le y$ 
for all strictly positive $y$, then $x \le 0$, for if not there would exist an integer 
$n$ such that $nx > 1$, whence $x > y$ with $y = 1/n > 0$. Other formulations: 

(III’) for all $a, b \in \mathbb{R}$ with $a < b$, there exists a $u \in \mathbb{Q}$ such that $a < u < b$. 
(III”) for all $a \in \mathbb{R}$ and $r > 0$, there exists an $x \in \mathbb{Q}$ such that $d(a, x) < r$. 

Let us first prove (III”), which is obvious if one accepts the decimal 
expression for real numbers. If not, one proceeds as follows. The late lamented 
Archimedes provides us with an integer $p > 0$ such that $1/p < r$; one can thus 
restrict to the case where $r = 1/p$. The inequality to resolve can be written 
$|pa - px| < 1$, i.e. 

$b - 1 < y < b + 1$, 

where $b = pa$ and where $y = px$ is neither more nor less rational than $x$. We 
shall even show that, in this case, one can choose $y$ in Z. If $b > 0$, there are 
integers $n$ such that $b \le n$; the least of these then satisfies $n - 1 < b \le n$, 
whence $b - 1 < n < b + 1$ as desired. If $b = 0$, one takes $y = 0$. If $b < 0$, one 
reduces to resolving $b' - 1 < y' < b' + 1$ on putting $b' = -b > 0$ and $y' = -y$. 
Whence (III”). 

To establish (III’), one puts $c = (a + b)/2$, $r = (b - a)/2$, and any $u \in \mathbb{Q}$ 
satisfying $|c - u| < r$ will do. 

It goes without saying that there always exist not just one, but infinitely 
many, rational numbers between a and b: choose an $x \in \mathbb{Q}$ between a and b, 
then an $x' \in \mathbb{Q}$ between a and x, then an $x'' \in \mathbb{Q}$ between a and x', etc. 
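
The proof of (III”) and (III’) is constructive enough to be turned into a few lines of Python. What follows is a sketch under an obvious limitation: a machine cannot hold an arbitrary real number, so the inputs are `Fraction` values standing in for reals, and the function names are ours. The call to `ceil` plays the role of "the least integer $n$ with $b \le n$" and covers the three cases $b > 0$, $b = 0$, $b < 0$ at once.

```python
from fractions import Fraction
from math import ceil, floor

def rational_within(a, r):
    """Return x = n/p with |a - x| < r, following the proof of (III'')."""
    p = floor(1 / r) + 1      # Archimedes: an integer p > 0 with 1/p < r
    b = p * a
    n = ceil(b)               # least integer with b <= n, so n - 1 < b <= n
    return Fraction(n, p)     # |a - n/p| < 1/p < r

def rational_between(a, b):
    """Return u with a < u < b, as in (III'): centre c, radius r."""
    c, r = (a + b) / 2, (b - a) / 2
    return rational_within(c, r)

a = Fraction(1, 3)
b = a + Fraction(1, 10**6)
u = rational_between(a, b)
assert a < u < b

# Repeating the construction between a and u gives a second rational, and so on.
u2 = rational_between(a, u)
assert a < u2 < u
```
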



In the sequel we shall constantly need to speak of intervals in the set R 
of real numbers. Given two numbers a and b, one denotes by 

$[a, b]$ the interval $a \le x \le b$, 
$[a, b[$ the interval $a \le x < b$, 
$]a, b]$ the interval $a < x \le b$, 
$]a, b[$ the interval $a < x < b$. 


These intervals (which may be empty if a > b) differ one from the other 
only in so far as they do, or do not, contain their endpoints. We shall also 
sometimes employ a notation such as ]a, b) to denote an interval “open on 
the left” but possibly open or closed on the right, optionally. For every real 
number a, one also denotes by 




$[a, +\infty[$ the interval $a \le x$, 
$]a, +\infty[$ the interval $a < x$, 
$]-\infty, a]$ the interval $x \le a$, 
$]-\infty, a[$ the interval $x < a$. 



Finally, one sometimes denotes R by the analogous notation $]-\infty, +\infty[$. 

The intervals of the form $[a, b]$, $[a, +\infty[$, $]-\infty, a]$ are called closed; those 
of the form $]a, b[$, with a and b possibly infinite, are called open, the whole 
interval $\mathbb{R} = ]-\infty, +\infty[$ being simultaneously open and closed. Finally, the 
intervals of the form $[a, b]$ with a and b finite are called compact. Later we 
shall define much more general open, closed and compact sets. 

As to the exact meaning of the symbols $+\infty$ and $-\infty$, used above or 
elsewhere, one must be clear⁹ that (my italics) 

(1) $\infty$ by itself means nothing, although phrases containing it sometimes 
mean something, 

(2) that in every case in which a phrase containing the symbol $\infty$ means 
something it will do so simply because we have previously attached a 
meaning to it by means of a special definition. 

⁹ Here I quote G. H. Hardy, Pure Mathematics (Cambridge University Press, 1908, 
Tenth ed., 1963, p. 117); it would be difficult to put it better. 



3 — Local or asymptotic properties 

As the reader will observe, one constantly has to examine functions of a single 
variable - it may vary on an interval of R, or only on N, or on some subset 
of R or of C, just to restrict ourselves to functions of a single real or complex 
variable - when the variable is either “very close” to a fixed value a where 
the function is or is not defined, or is “very large”, i.e. “very close to infinity”. 
One might like to know for example that $x^2$ is equal to 1 to within 0.00001 
provided that x is “sufficiently close” to 1, that $(2x + 1)/(x - 1)$ is equal to 
2 to within 0.001 provided that x is “sufficiently large” or that $1/n^3$ is less 
than $10^{-100}$ when the positive integer n is “sufficiently large”; one also has 
to know that, for x close to 0, $x^3$ is “negligible” in comparison to x, that for 
n very large $10^{100} n^2 + 10^{100000} n$ is “of the same order of magnitude” as $n^2$, 
etc. It is easy to give a perfectly clear meaning to these a priori rather vague 
expressions. 

It is best to consider generally an assertion P(x) in which there appears 
a letter x (or y, or n, or p, or some other) supposed to represent a number 
varying in a given set E of real or complex numbers; for example, the relations 

$|x^2 - 1| < 0.00001$ where $x \in E = \mathbb{R}$, 

$|(2x + 1)/(x - 1) - 2| < 0.001$ where $x \in E = \mathbb{R} - \{1\}$, 

$1/n^3 < 1/10^{100}$ where $n \in E = \mathbb{N} - \{0\}$, 

$n^2 < 10^{100} n^2 + 10^{100000} n < (10^{100} + 1) n^2$ where $n \in E = \mathbb{Z}$. 

In the case of R, we shall say that P(x) is true for all sufficiently large 
positive $x \in E$ (or, for short, true for x large) if there exists a number M 
such that, for $x \in E$, the relation $x > M$ implies P(x). There is an analogous 
definition for x sufficiently large and negative. If one does not specify 
“positive” or “negative”, then we mean that P(x) is true for $|x| > M$. In the case 
of C, where these inequalities have no meaning, one says that P(x) is true 
for large x if there exists a number $M > 0$ such that $|x| > M$ implies P(x): 
in other words if P(x) is true outside a sufficiently large disc. 
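
To make the definition concrete, take the second assertion of the list above. Since $(2x + 1)/(x - 1) - 2 = 3/(x - 1)$, the number M = 3001 does the job. The sketch below tests this on sample points; the sampling grid and the name `P` are our own choices, and exact `Fraction` arithmetic avoids any rounding doubts.

```python
from fractions import Fraction

def P(x):
    """The assertion |(2x + 1)/(x - 1) - 2| < 0.001, for x != 1."""
    return abs((2 * x + 1) / (x - 1) - 2) < Fraction(1, 1000)

# (2x + 1)/(x - 1) - 2 = 3/(x - 1), so P(x) holds as soon as x > M = 3001.
M = 3001
samples = [M + Fraction(k, 7) for k in range(1, 2000)]
assert all(P(x) for x in samples)

# At x = M itself the difference is exactly 0.001, so P fails there.
assert not P(Fraction(M))
# P is also true for x large and negative, e.g. x = -3000 (then |x - 1| = 3001).
assert P(Fraction(-3000))
```

Any larger M would serve equally well: the definition asks only that some M exist, not that one find the best one.
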

Given a number $a \in \mathbb{R}$ or $\mathbb{C}$, we say similarly that P(x) is true for all 
$x \in E$ sufficiently close to a or, more briefly, that P(x) is true on a 
neighbourhood of a, if there exists a number $r > 0$ such that 

(3.1) $\{[d(a, x) < r]\ \&\ (x \in E)\} \Longrightarrow P(x)$; 

in plain language: P(x) is true for all $x \in E$ such that $|x - a| < r$. Another 
formulation: we call¹⁰ an open ball with centre a any set $B(a, r)$ defined by 
an inequality $d(a, x) < r$ with a strictly positive r, in other words, in the case 
of R, any interval $]a - r, a + r[$ with $r > 0$ and, in the case of C, any disc of 
centre a, the circumference excluded. Then (3.1) means that there exists an 
open ball $B(a, r) = B$ with centre a such that 

$x \in B \cap E \Longrightarrow P(x)$. 

¹⁰ The use of the word “ball” rather than the word “interval” (in the case of R) 
or of the word “disc” or “circle” (in the case of C) is justified by the case of 
functions of several variables, i.e. defined on a subset of a Cartesian space $\mathbb{R}^p$. 

The fact that we have chosen “open” balls here, i.e. defined by a strict 
inequality $d(a, x) < r$, rather than closed balls defined by a weak inequality 
$d(a, x) \le r$, is of no importance: an open ball with centre a and radius r 
contains every closed ball with the same centre and radius $r' < r$, and vice-versa; 
a relation valid in an open ball will also be valid in a smaller closed ball and 
vice-versa. 

Most authors, following a tradition that goes back, at least, to the famous 
courses in analysis delivered by Karl Weierstrass in Berlin around 1870, use $\varepsilon$ 
or $\delta$ for what we call r, the psychology of this notation being that these letters 
are reserved for “very small” numbers; it is obvious that if one can verify that 
the assertion P(x) is true once $d(a, x) < 10^{-4}$ it is unnecessary to examine 
what happens in the ball $d(a, x) < 1000$. This usage, however, probably stems 
from the fact that $\delta$ is the initial letter of the word “difference” and that $\varepsilon$ 
immediately follows $\delta$ in the Greek alphabet; we realise this when we define 
the continuity of a function at a: 

for any $\varepsilon > 0$ there exists a $\delta > 0$ such that 
$|x - a| < \delta \Longrightarrow |f(x) - f(a)| < \varepsilon$, 

or again: if the difference between x and a is sufficiently small, i.e. smaller 
than a suitably chosen number $\delta > 0$, then the difference between f(x) and 
f(a) is also as small as one wishes, i.e. smaller than any number $\varepsilon > 0$ 
given in advance. The use of the letter $\delta$ (apparently begun by Cauchy) is 
thus relatively rational, and that of the letter $\varepsilon$ (introduced by Weierstrass) 
probably followed for a reason quite unconnected with mathematics. 

In any case, r (or $\varepsilon$, or $\delta$) may also be “very large” since one asks no more 
than that they exist; further, the concept of a “very small” or “very large” fixed 
number has no objective meaning. There could be no objection if the reader 
followed a preference for the letter R, or p, or whatever he wanted, a □ or $ 
sign for example: in a statement such as (1), as in the notation $\sum$ in n° 1, 
the letter r is a phantom, a “bound variable”, whose only role is to serve as 
the logical link between the assertions “there exists $r > 0$” and “$|x - a| < r$ 
implies P(x)”. We prefer the letter r because it suggests the radius of a ball; 
moreover, it is directly available on all typewriter and computer keyboards. 

One might also restrict oneself to powers of 10 and say: 

there exists an $n \in \mathbb{Z}$ such that P(x) for all $x \in E$ satisfying 
$|x - a| < 10^{-n}$. 

In the language of decimal approximations: there exists an n such that the 
property P(x) is true provided that the first n decimal places of x agree with 
those of a. The equivalence with (1) can be seen from the observation that, 
first, the powers of 10 are among the numbers $r > 0$, secondly, that if (1) 
is satisfied for an r which is not a power of 10, it will be satisfied a fortiori 
if one replaces r by a number $10^{-n}$ with n sufficiently large that $10^{-n} < r$. 
From time to time we will give translations into this language. 



The fundamental point to remember in these modes of expression is that, 
if one has a finite number of assertions $P_1(x), \ldots, P_n(x)$, and if each of these 
assertions, taken separately, is true on a neighbourhood of a given point a, or 
for x sufficiently large, then so is the logical conjunction of the given relations; 
in other words, they are simultaneously true on a neighbourhood of a, or for 
x sufficiently large. Indeed, in the second case there are numbers $A_1, \ldots, A_n$ 
such that $P_i(x)$ is true for all $x > A_i$, so that the n assertions considered will 
simultaneously be true when x exceeds the largest of the numbers $A_i$. In the 
first case, there are open balls $B(a, r_1), \ldots, B(a, r_n)$ with centre a in which 
the corresponding assertions are valid; they will thus be simultaneously valid 
in the intersection of these balls, which is the open ball $B(a, r)$ whose radius 
is the least of the radii of the balls considered. 

Contrariwise, the intersection of infinitely many open balls with centre a 
(resp. of infinitely many intervals of the form $]A, +\infty[$) can very well reduce 
to the point a (resp. be empty): this is the case of the balls $B(0, 1/n)$, $n \in \mathbb{N}$, by 
Archimedes’ axiom. We shall prove later that, in R, every intersection of 
intervals is again an interval, possibly empty, but the intersection of an infinite 
number of open intervals need not again be an open interval; the intervals 
$]-1/n, 1/n[$ provide a counterexample. 



These concepts are particularly useful when one wants to compare the 
“orders of magnitude” - a naive expression having no mathematically precise 
meaning¹¹ - of two scalar functions f(x) and g(x) when the variable increases 
indefinitely or approaches indefinitely close to a limit value a; what then 
matters is the ratio $|f(x)/g(x)|$, which may, when x tends to infinity or to a, 
either take values as large as one wants, or remain less than a fixed number, 
or remain confined between two fixed strictly positive numbers, or approach 1 
more and more closely, or approach 0 more and more closely, not to speak of 
the cases where nothing of this kind happens. 

¹¹ This is not the same as in physics, where an “order of magnitude” means a 
factor 10. Example: the power of an atomic bomb is “three orders of magnitude” 
(i.e. $10 \times 10 \times 10 = 10^3$) greater than that of a classical explosive. One of the 
experts in the subject even claims that the prestige attached to the megatonne, 
or to a million dollars, is linked to the fact that men have ten fingers; Herbert 
York, Race to Oblivion (Simon & Schuster, 1970), pp. 89-90: “We picked a 
one-megaton yield for the Atlas warhead for the same reason that everyone speaks of 
rich men as being millionaires and never as being tenmillionaires or 
one-hundred-thousandaires. It really was that mystical, and I was one of the mystics. Thus, 
the actual physical size of the first Atlas warhead and the number of people 
it would kill were determined by the fact that human beings have two hands 
with five fingers each and therefore count by tens”. The committee entrusted 
with deciding the characteristics of the Atlas in 1953-1955 was chaired by J. von 
Neumann. York’s book is a sparkling exposition of the American contribution to 
the arms race before 1970. 

These comparisons can be expressed with the aid of notation which we 
shall use frequently later and that it is important not to confuse. There are 
four cases to consider. 

(i) The relation 

$f(x) = O(g(x))$ when $x \to +\infty$ (or when $x \to a$), 

with an upper case O, means that there exists a number $M > 0$ independent 
of x such that one has $|f(x)| \le M|g(x)|$ for x large (or for x close to a) in 
the sense we have given above to these expressions. The notation $O(g(x))$ 
is used to denote not only a particular function f, but also an arbitrary 
function $O(g(x))$. Experience shows that the ambiguities thus introduced 
have no unfortunate consequences if one keeps this convention in mind. For 
example, the relations $f_1 = O(g)$ and $f_2 = O(g)$ do not imply $f_1 = f_2$ 
despite what one might believe at first glance. Similarly, the obvious relation 
$O(g(x)) + O(g(x)) = O(g(x))$ - obvious since if two functions are, for x large, 
majorised by $5|g(x)|$ and $12|g(x)|$ respectively, then their sum is majorised by 
$17|g(x)|$ - does not imply that $O(g(x)) = 0$. We shall return to these points 
in detail in Chap. VI. 

For example 

$10^{100} n^2 + 10^{100000} n = O(n^2)$ when $n \to +\infty$ 

since for $n \ge 1$ (whence $n \le n^2$), the left hand side is less than $Mn^2$ with 

$M = 10^{100} + 10^{100000}$; 

this number may appear “very large” to the puny members of the human 
race, but it is independent of n and one does not demand more. 
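
Python integers have unlimited size, so even this constant M can be handled exactly. The sketch below (our own illustration; the names `A`, `B`, `f` are not from the text) checks the majoration for small n, and also the sharper fact, used in the list of examples in this n°, that the left hand side drops below $(10^{100} + 1) n^2$ once n exceeds $10^{100000}$.

```python
A, B = 10 ** 100, 10 ** 100000   # exact integers, no overflow in Python
M = A + B

def f(n):
    return A * n ** 2 + B * n

# f(n) <= M n^2 for every n >= 1, because n <= n^2 there.
assert all(f(n) <= M * n ** 2 for n in range(1, 200))

# For n > B one even has f(n) < (A + 1) n^2, since then B n < n^2.
n = B + 1
assert f(n) < (A + 1) * n ** 2
# At n = B the two sides are equal, so the threshold cannot be lowered.
assert f(B) == (A + 1) * B ** 2
```
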

(ii) The relation 

$f(x) \asymp g(x)$ when $x \to +\infty$ (or when $x \to a$) 

means that there exist numbers $m > 0$ and $M > 0$ such that 

$m|g(x)| \le |f(x)| \le M|g(x)|$ 

for x large, or for x close to a. One says f and g are comparable or have 
the same order of magnitude in these circumstances. This is equivalent to 
requiring that $f = O(g)$ and $g = O(f)$ simultaneously. 

For example, 

$x + \sin x \asymp x$ when $x \to +\infty$ 

since the left hand side lies between $x - 1$ and $x + 1$, so between $x/2$ and $2x$ 
for $x \ge 2$. 
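
A quick numerical look at this example, offered as a sketch only: floating-point arithmetic and a sampling grid of our own choosing, with the constants m = 1/2 and M = 2 read off from the argument just given.

```python
import math

def f(x):
    return x + math.sin(x)

# With m = 1/2 and M = 2: x/2 <= x - 1 <= f(x) <= x + 1 <= 2x whenever x >= 2.
xs = [2 + 0.25 * k for k in range(4000)]
assert all(0.5 * x <= f(x) <= 2 * x for x in xs)
```

A finite sample proves nothing about all x, of course; the inequalities $x - 1 \le x + \sin x \le x + 1$ do that.
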

(iii) The relation 

$f(x) \sim g(x)$ when $x \to +\infty$ (or when $x \to a$) 

will be defined in the next n°, as will be 

(iv) The relation 

$f(x) = o(g(x))$ 

with a lower case o. These two relations presuppose the concept of limit as 
already known. 

4 — The concept of limit. Continuity and differentiability 

At our present level the concept of limit applies to complex-valued functions 
defined on a subset X of C (and in particular of R) when the variable $x \in X$ 
increases indefinitely or else approaches a value $a \in \mathbb{C}$ indefinitely closely. If, 
for example $X \subset \mathbb{R}$, the relation 

$\lim_{x \to +\infty} f(x) = u$ 

means that, for all $r > 0$, one has $|f(x) - u| < r$ for x large, in other words 
that, for every $r > 0$, there exists a number N (depending on r) such that 

(4.1) $x > N \Longrightarrow d[f(x), u] < r$. 

The limit when x tends to $-\infty$ is defined analogously: we replace the 
condition $x > N$ by $x < N$. (One makes no assumption as to the sign of N.) In 
the complex case where these inequalities mean nothing one clearly needs to 
write that 

$|x| > N \Longrightarrow |f(x) - u| < r$. 

Similarly, the relation 

$\lim_{x \to a} f(x) = u$ 

means that, for all $r > 0$, one has 

(4.2) $d[f(x), u] < r$ for all $x \in X$ sufficiently close to a 

i.e. that there exists a number $r' > 0$ (depending on r) such that, for $x \in X$, 
the relation 

(4.3) $|x - a| < r'$ implies $|f(x) - u| < r$. 

Note in passing that we do not assume that $a \in X$: one can have $X = ]0, 1[$ 
and $a = 0$. For example, in $X = \mathbb{C}$, $1/z$ tends to 0 when $z \to \infty$ since the 
inequality $|1/z| < r$ is satisfied as soon as $|z| > 1/r$. Similarly, $x^2$ tends to 4 
when x tends to 2, since on the one hand 

$|x^2 - 4| = |x - 2| \cdot |x + 2| \le 5|x - 2|$ for $|x - 2| < 1$, 

while on the other hand, $5|x - 2| < r$ for $|x - 2| < r/5$; so $|x^2 - 4| < r$ once 
$|x - 2| < r' = \min(1, r/5)$. 
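
The recipe $r' = \min(1, r/5)$ can be put to the test. The following sketch samples points of the ball $|x - 2| < r'$ for a few values of r; the grid of sample points and the name `r_prime` are assumptions of ours, and `Fraction` keeps the comparisons exact.

```python
from fractions import Fraction

def r_prime(r):
    """The radius r' = min(1, r/5) found above for the limit of x^2 at 2."""
    return min(Fraction(1), Fraction(r) / 5)

for r in (Fraction(3), Fraction(1, 10), Fraction(1, 10 ** 6)):
    rp = r_prime(r)
    # Sample points x with |x - 2| < r' on both sides of 2.
    for k in range(1, 200):
        for x in (2 + rp * Fraction(k, 200), 2 - rp * Fraction(k, 200)):
            assert abs(x - 2) < rp
            assert abs(x * x - 4) < r
```

Note that r = 3 is not "very small", and the definition does not care: it asks for an r' for every $r > 0$.
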

We shall hardly ever, in this chapter, use the concept of limit except for 
the case of sequences, i.e. of functions where the independent variable takes 
only integer values. The general case which we have just mentioned will be 
covered in detail in Chap. III. Since we shall nevertheless need to speak 
occasionally of continuity and of differentiability in this chapter, let us now 
give the definitions of these two fundamental properties. 

A scalar (i.e. complex-valued) function f defined on a set $X \subset \mathbb{C}$ is said 
to be continuous at a point a of X if 

$\lim f(x) = f(a)$ when $x \to a$. 

This means that for all $r > 0$ there exists an $r' > 0$ such that 

(4.4) $\{(x \in X)\ \&\ (|x - a| < r')\} \Longrightarrow |f(x) - f(a)| < r$ 

or again that, for all $r > 0$, f(x) is constant to within r on a neighbourhood 
of a in X or again, in decimal language, that if one wants to calculate f(a) 
to within $10^{-n}$ it suffices to calculate f(x) for any $x \in X$ having sufficiently 
many decimal places in common with a, for example the number obtained 
by replacing all the digits of a of sufficiently high rank by 0; this is what all 
the practitioners of numerical analysis have always done, and this is what 
all computers do. The calculation above, showing that $x^2$ tends to 4 when x 
tends to 2, expresses the continuity of the function $x \mapsto x^2$ at $x = 2$. 

More generally, let us show, as a useful exercise, that the functions $x^n$, or, 
as clearly amounts to the same, $f(x) = x^{[n]} = x^n/n!$, are continuous on C. It 
suffices to show - the other formulation of continuity - that, for x given, the 
difference $|f(x + h) - f(x)|$ is $< r$ for $|h|$ sufficiently small. Now the binomial 
formula (1.6) shows that 

(4.5) $|f(x + h) - f(x)| = |x^{[n-1]}h + x^{[n-2]}h^2/2! + \ldots + h^n/n!| \le |h| \cdot \{|x|^{[n-1]} + |x|^{[n-2]}|h|/2! + \ldots + |h|^{n-1}/n!\}$. 

The expression between the braces { } differs from the expansion of 
$(|x| + |h|)^{[n-1]}$ only in the presence, in the term in $|h|^p$, of a denominator $(p + 1)!$ 
instead of $p!$; since $(p + 1)! \ge p!$, one concludes that 

(4.6) $|(x + h)^{[n]} - x^{[n]}| \le |h|(|x| + |h|)^{[n-1]}$; 

since $|h|$ is a factor of the right hand side, continuity is proved, since, 
for $|h| \le 1$ for example, the right hand side is bounded by $M|h|$ where 
$M = (|x| + 1)^{[n-1]}$ does not depend on h. The inequality (6) will serve us 
on other occasions in proving the continuity and the differentiability of much 
more general functions. 
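
The argument just given actually produces an explicit radius: if $|h| < r' = \min(1, r/M)$ with $M = (|x| + 1)^{[n-1]}$, then $|(x + h)^{[n]} - x^{[n]}| \le M|h| < r$. Here is a small sketch of that recipe; it is restricted, by our own choice, to rational x and h so that the arithmetic is exact, and the names `dp` and `radius` are illustrative.

```python
from fractions import Fraction
from math import factorial

def dp(x, n):
    """Divided power x^[n] = x^n / n!."""
    return Fraction(x) ** n / factorial(n)

def radius(x, n, r):
    """An r' > 0 such that |h| < r' forces |(x+h)^[n] - x^[n]| < r (for n >= 1)."""
    M = dp(abs(Fraction(x)) + 1, n - 1)
    return min(Fraction(1), Fraction(r) / M)

x, n, r = Fraction(-7, 3), 5, Fraction(1, 1000)
rp = radius(x, n, r)
for k in range(-99, 100):
    h = rp * Fraction(k, 100)          # sample increments with |h| < r'
    assert abs(dp(x + h, n) - dp(x, n)) < r
```
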



The derivative of a function f at a point a is defined similarly, as the limit 
of the ratio 

$\frac{f(x) - f(a)}{x - a}$ 

as x tends to a remaining $\neq a$, i.e. tends to a in the set $X' = X - \{a\}$ obtained 
by omitting the point a from the set of definition X of f; or again, by the 
traditional formula 

(4.7) $f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}$ 

where one clearly has to impose on h the conditions which make the quotient 
meaningful: $h \neq 0$ and $a + h \in X$. In practice, X is either an interval of R 
containing a, or, in the case of a function of a complex variable, a subset of 
C containing an open ball with centre a. This last case, appreciably more 
subtle than the first, will arise in n° 19 à propos so-called analytic functions, 
but, here, it is no more difficult to understand than that of functions of a 
real variable. 

For example let us calculate the derivative of the function $f(x) = x^{[n]}$ for 
$n \in \mathbb{N}$. By the binomial formula, one has 

$f(x + h) = x^{[n]} + x^{[n-1]}h + x^{[n-2]}h^{[2]} + \ldots$, 

whence 

$[f(x + h) - f(x)]/h = x^{[n-1]} + \ldots$ 

where the terms omitted represent, for x given, a polynomial in h of degree 
$n - 1$ whose term independent of h is zero. It is almost obvious - and the 
simpler rules of calculating limits will confirm this if the reader is not yet 
fully convinced ... - that this polynomial tends to 0 with h, so that in the 
limit one obtains the formula 

(4.8) $(x^{[n]})' = x^{[n-1]}$ 

or, in more traditional notation, 

(4.8 bis) $(x^n)' = nx^{n-1}$. 

One should note that this calculation is as valid in C as in R. 

Here again, it is helpful to refine the previous calculation a little. If one 
imitates (5), one obtains 

$|f(x + h) - f(x) - x^{[n-1]}h| = |x^{[n-2]}h^2/2! + \ldots + h^n/n!| \le |h|^{[2]}\{|x|^{[n-2]} + |x|^{[n-3]}|h|/1! + \ldots + |h|^{n-2}/(n-2)!\}$ 

since $p! \ge 2!(p - 2)!$, whence the inequality 

(4.9) $|(x + h)^{[n]} - x^{[n]} - hx^{[n-1]}| \le |h|^{[2]}(|x| + |h|)^{[n-2]}$ 

similarly to (6), or 

$\left|\frac{(x + h)^{[n]} - x^{[n]}}{h} - x^{[n-1]}\right| \le \frac{|h|}{2}(|x| + |h|)^{[n-2]} \le M|h|$ 

for, say, $|h| \le 1$. We then get (8) by letting h tend to 0. More generally, 

(4.10) $|(x + h)^{[n]} - x^{[n]} - hx^{[n-1]} - \ldots - h^{[p]}x^{[n-p]}| \le |h|^{[p+1]}(|x| + |h|)^{[n-p-1]}$. 
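
Inequality (4.10) contains (4.6) for p = 0 and (4.9) for p = 1, so one numerical check covers all three. The sketch below does this for a few rational values of x and h chosen by us; `dp` and `remainder` are illustrative names, and the arithmetic is exact.

```python
from fractions import Fraction
from math import factorial

def dp(x, n):
    """Divided power x^[n] = x^n / n!."""
    return Fraction(x) ** n / factorial(n)

def remainder(x, h, n, p):
    """(x+h)^[n] minus the terms h^[k] x^[n-k] for k = 0, ..., p."""
    return dp(x + h, n) - sum(dp(h, k) * dp(x, n - k) for k in range(p + 1))

x, n = Fraction(5, 2), 6
for h in (Fraction(1, 3), Fraction(-4), Fraction(1, 100)):
    for p in range(n):                 # p = 0 gives (4.6), p = 1 gives (4.9)
        bound = dp(abs(h), p + 1) * dp(abs(x) + abs(h), n - p - 1)
        assert abs(remainder(x, h, n, p)) <= bound
```

The factor $|h|^{[p+1]}$ on the right is what matters in applications: the error committed in stopping the expansion at the term in $h^{[p]}$ is of the order of the first term omitted.
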



The concept of a limit now allows us to define the comparison relations 

$f(x) \sim g(x)$, $f(x) = o(g(x))$ 

which we abandoned to their fate at the end of the preceding n°. The first 
- one says that the functions f(x) and g(x) are equivalent at infinity or on 
a neighbourhood of the point a - means that the ratio $f(x)/g(x)$ tends to 1 
when x tends to the limit value considered. The second means that this same 
ratio tends to 0; one says that f(x) is negligible with respect to g(x) in these 
circumstances. 

The second relation means that for all $r > 0$ 

$|f(x)| \le r|g(x)|$ 

for x large (or for x close to a), i.e. that there exists an $r' > 0$ such that 

(4.11) $|x - a| < r' \Longrightarrow |f(x)| \le r|g(x)|$. 

For example, 

$x^2 = o(x)$ when $x \to 0$ 

since $|x| < r$ implies $|x^2| \le r|x|$; so for example $|x^2| \le 10^{-1000}|x|$ once 
$|x| < 10^{-1000}$. In C, one has similarly 

$x = o(x^2)$ when $x \to \infty$ 

since the relation $|x| \le r|x^2|$ is satisfied once $|x| \ge 1/r$. 

As to the relation $f \sim g$, it reduces to the one we have just described. 
One can express this as: for all $r > 0$, one has 

$|f(x)/g(x) - 1| \le r$ 

for x large (or close to a). But this can be written 

$|f(x) - g(x)| \le r|g(x)|$, 

in other words $f(x) - g(x) = o(g(x))$, or again 

(4.12) $f(x) = g(x) + o(g(x))$: 

f(x) is the sum of g(x) and of a function negligible with respect to g when x 
tends to the value considered. Here, as in the case of the notation $O(g(x))$, 
the notation $o(g(x))$ is used to denote any function negligible with respect to 
g(x). Some authors introduce indices to avoid confusing functions that are in 
fact distinct: $o_1(g)$, $o_2(g)$, etc. 

For example 

$x^2 + x \sim x^2$ when $|x| \to \infty$ 

since we have seen above that $x = o(x^2)$. 

Another example: to say that a function f defined on a neighbourhood 
of a point a possesses a derivative at a means that there exists a constant c 
such that 

(4.13) $f(a + h) = f(a) + ch + o(h)$ when $h \to 0$; 

for this relation means that, for all $r > 0$, 

$|f(a + h) - f(a) - ch| \le r|h|$, i.e. $\left|\frac{f(a + h) - f(a)}{h} - c\right| \le r$ 

for $|h|$ sufficiently small, in other words that f possesses a derivative $f'(a) = c$ 
at a. For example 

$\sin x = x + o(x) \sim x$ when $x \to 0$ 

since the ratio $\sin x/x$ tends to the derivative of the function sine for $x = 0$, 
i.e. to $\cos 0 = 1$. 
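
A numerical glance at this last example, as a sketch only: it uses floating-point arithmetic, and the pairs (r, h) below are sample values we picked, not thresholds derived in the text. The quantity computed is $|\sin h - \sin 0 - ch|/|h|$ with $c = \cos 0 = 1$, which (4.13) requires to fall below any given r.

```python
import math

def ratio_error(h):
    """|sin(h) - sin(0) - c*h| / |h| with c = cos(0) = 1."""
    return abs(math.sin(h) - h) / abs(h)

# For each r the error ratio is below r once |h| is small enough.
for r, h in ((1e-2, 0.1), (1e-4, 0.01), (1e-6, 0.001)):
    assert ratio_error(h) < r and ratio_error(-h) < r
```
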



5 — Convergent sequences: definition and examples 

Let us return to the principal topic of this chapter: convergent sequences. As 
we said while explaining Set Theory, a sequence of elements of a set E is a 
function defined for all integers $n \ge 1$ (or, more generally, for all sufficiently 
large $n \in \mathbb{Z}$) with values in E; the value of this function for $x = n$ might, 
for example, be written $f(n)$, but the tradition has it that it is better to 
write $u_n$ and to employ the notation $(u_n)$ to denote the succession of values 
$u_1, u_2, \ldots$ of the terms of the sequence. This of course does not prevent us 
from using functional notation $u(n)$ when that appears more convenient, 
particularly for typists, a category to which mathematicians have belonged for 
a long time¹². Having recalled this, one says that a sequence $(u_n)$ of complex 
numbers converges or is convergent if there exists a number u, the limit of 
the sequence, such that, for all $r > 0$, 

(5.1) $d(u, u_n) < r$ for all sufficiently large n, 

in other words if, for all $r > 0$, there exists an integer N (generally depending 
on r, unless $u_n = u$ for all sufficiently large n) such that 

(5.1’) $|u - u_n| < r$ for all $n > N$. 

This definition is only a particular case of the general concept defined at the 
beginning of n° 4: X is here the set of $n \in \mathbb{Z}$ for which $u_n$ is meaningful. 

It would clearly be enough to verify (1) for numbers r of the form $1/p$, or 
$1/10^p$ or, if one is a fan of the binary system, $1/2^p$, since one can always choose 
p so that, for example, $1/10^p < r$. If the sequence has real terms, this would 
indicate that, for all sufficiently large n, the decimal expansion of rank p of $u_n$ 
is identical to that of u. Even though this idea is right, this formulation is not 
entirely correct because of the eccentricities of representation: the sequence 
whose successive terms are 

0.9 1.1 0.99 1.01 0.999 1.001 etc. 

obviously converges to 1, but contradicts the hypothesis of “asymptotic 
stability” of the decimal expansions of given order for terms of a sequence. 
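
The sequence 0.9, 1.1, 0.99, 1.01, ... makes a convenient test case for definition (5.1’). In the sketch below, the indexing from n = 1, the closed formula for $u_n$ and the choice N = 2p are our own reading of the pattern; the arithmetic is exact.

```python
from fractions import Fraction
from math import floor

def u(n):
    """n-th term (n >= 1) of 0.9, 1.1, 0.99, 1.01, 0.999, 1.001, ..."""
    k = (n + 1) // 2                      # which pair the term belongs to
    sign = -1 if n % 2 == 1 else 1        # odd rank: below 1, even rank: above 1
    return 1 + sign * Fraction(1, 10 ** k)

def rank(p):
    """An N such that |1 - u_n| < 1/10^p for all n > N."""
    return 2 * p

for p in range(1, 6):
    N = rank(p)
    assert all(abs(1 - u(n)) < Fraction(1, 10 ** p) for n in range(N + 1, N + 50))

# The first decimal digit never settles: 10*u_n truncates alternately to 9 and 10.
assert [floor(10 * u(n)) for n in range(3, 9)] == [9, 10, 9, 10, 9, 10]
```

The last line is the "eccentricity of representation" in action: the terms are as close to 1 as one likes, yet their decimal expansions keep oscillating between 0.99... and 1.00...
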




In the case where all the $u_n$ are real, one can represent the sequence $(u_n)$ 
by a piecewise linear graph (figure 1) joining the different points $(n, u_n)$ in 
the plane. Convergence of $u_n$ to u then means that this graph is “asymptotic” 
to the horizontal line $y = u$ in the plane (not by the definition of a limit - 
one has no need to appeal to geometry for this - but, quite the contrary, by 
the definition of an asymptote). A similar remark applies to the case of a 
function f(x) defined for sufficiently large $x \in \mathbb{R}$ and which tends to a limit 
when x increases indefinitely: the fact that the function $1/x$ tends to 0 at 
infinity shows that its graph is asymptotic to the x axis. 

¹² Indeed, since one can get the majority of scientists to do almost anything by 
challenging their abilities, the use of very sophisticated mathematical word 
processors has transformed many mathematicians into voluntary quasi-professional 
typographers (we repeat: quasi) - to the greater benefit of the true professionals 
thus displaced . . . 

When a sequence $(u_n)$ tends to a limit u, one may write

$\lim_{n \to \infty} u_n = u$ or $\lim_{n \to +\infty} u_n = u$

or even simply $\lim u_n = u$. Hardy’s admonitions on the significance of the
symbol $\infty$ apply here in full. Moreover, like the letter r in the last n°, the
letter n is only a phantom and one can replace it by any other sign, on
condition that one does so everywhere; one may very well write

$\lim_{\$ \to \infty} u(\$) = u$,

the mathematics will not change; however, it is forbidden to keep the first
sign \$ and to replace the second by the sign £ since different letters a priori
represent variables independent of one another: one would then have
$\lim_{\$ \to \infty} u(\pounds) = u(\pounds)$, unless one specifies this by a relation such as £ = f(\$).

It is clear from the definition that for a sequence $(u_n)$ to tend to a limit u
it is necessary and sufficient that the sequence with general term $u - u_n$ tends
to 0, which one can write in the form

$u_n = u + o(1)$ when $n \to +\infty$

since the symbol o(1) represents any sequence or function negligible with
respect to the constant function 1, i.e. tending to 0.

Moreover, if $u_n = v_n + i w_n$ is a sequence of complex terms, the inequalities

$|v_n - v|, \ |w_n - w| \le |u_n - (v + iw)| \le |v_n - v| + |w_n - w|$

show that

$\lim (v_n + i w_n) = v + iw \iff (\lim v_n = v) \ \& \ (\lim w_n = w)$.

The inequality $||u_n| - |u|| \le |u_n - u|$ shows on the other hand that, for real
or complex sequences,

$\lim u_n = u \implies \lim |u_n| = |u|$.

The converse is clearly false.

The most obvious example of a convergent sequence is obtained - and this
is far from being a coincidence - by starting with a real number u and
denoting by $u_p$ its truncated or default decimal expansion of rank p; this is a
number of the form $k/10^p$, where k is an integer such that

$u_p < u \le u_p + 10^{-p}$;

with this definition, the default decimal expansion of the number 1 is
0.9999... Then

$d(u, u_n) \le 10^{-p}$ for all $n \ge p$,

from which u is the limit of the $u_n$. This example makes clear the fact that
we cannot define the real numbers without using the concept of limit in one
way or another. It also shows that if one tries to limit the domain of
analysis to $\mathbb{Q}$ there will be multitudes of sequences which will not converge
because of the fact that their limit is irrational.

We mentioned above that when a sequence $(u_n)$ of real numbers converges
to a limit u, the decimal expansion of the number $u_n$ has a strong tendency
to stabilise as n increases indefinitely. One might deduce that a convenient
experimental method to ascertain convergence, or to calculate the limit, of a
sequence is to examine numerically a sufficiently large number of terms. This
can spring several surprises on modern programmed calculators.

If one considers for example the sequence with general term

$u_n = (1 + 1/n)^n$,

which, we shall see later, converges to the number

e = 2.71828 18284 590...,

the base of the natural logarithms, one finds that

$u_4 = 2.44141\ldots$, $u_{64} = 2.69734\ldots$, $u_{1024} = 2.71696\ldots$

which indicates that $u_n$ approaches its limit value only very slowly, and that
one would have to choose enormous values of n to obtain even ten or so
decimal places of the number e; very luckily, the sequence with general term

$1 + 1/1! + 1/2! + 1/3! + \ldots + 1/n!$

also converges to e, but with prodigious rapidity since the term $1/(n+1)!$
which one adds to $u_n$ to obtain $u_{n+1}$ becomes microscopic very quickly.
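
The contrast between the two sequences is easy to observe numerically. The sketch below is only an illustration: the function names `compound` and `factorial_sum` are assumptions, and `math.e` serves as the reference value of e.

```python
import math
from fractions import Fraction

def compound(n: int) -> float:
    """General term (1 + 1/n)**n of the slowly convergent sequence."""
    return (1 + 1 / n) ** n

def factorial_sum(n: int) -> Fraction:
    """Exact value of 1 + 1/1! + 1/2! + ... + 1/n!."""
    total, term = Fraction(1), Fraction(1)
    for k in range(1, n + 1):
        term /= k            # term is now 1/k!
        total += term
    return total

for n in (4, 64, 1024):
    print(n, compound(n), math.e - compound(n))

for n in (5, 10, 15):
    s = float(factorial_sum(n))
    print(n, s, math.e - s)
```

The first loop reproduces the three values quoted above; in the second, fifteen terms already agree with e to within the precision of a double.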

Another example of a slowly convergent sequence:

$u_n = \frac{1}{1 \cdot 2} + \frac{1}{3 \cdot 4} + \frac{1}{5 \cdot 6} + \ldots + \frac{1}{(2n-1) \cdot 2n}$

$= 1 - 1/2 + 1/3 - 1/4 + \ldots + 1/(2n-1) - 1/2n$.

It was already known by the end of the XVIIth century that to calculate the
limit, namely log 2, exactly to 9 decimal places, quite a modest precision, one
must choose $n > 10^8$; see n° 13 for alternating series.
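
A short experiment shows how slowly this sequence approaches log 2. This is a sketch only: the name `u` mirrors the notation above, `math.log(2)` is taken as the reference value, and the observation that the error behaves roughly like 1/(4n) is an empirical remark about the printed output, not a statement from the text.

```python
import math

def u(n: int) -> float:
    """Partial sum 1/(1*2) + 1/(3*4) + ... + 1/((2n-1)*2n)."""
    return math.fsum(1.0 / ((2 * k - 1) * (2 * k)) for k in range(1, n + 1))

for n in (10, 1000, 100000):
    error = math.log(2) - u(n)
    # n * error stays close to 1/4, so each extra decimal place costs ten times more terms.
    print(n, u(n), error, n * error)
```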
The sequence

$u_n = 1 + 1/2 + 1/3 + \ldots + 1/n$

is divergent or, what comes to the same for an increasing sequence, its terms
increase above any bound as we shall see in n° 7. Even on choosing an index n
as hyperastronomic as $10^{100}$, one finds a value only of the order of 230, a result
which, numerically, is perfectly compatible with the false hypothesis that the
sequence converges to 231.
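
The order of magnitude 230 can be recovered without adding $10^{100}$ terms. The sketch below relies on the classical estimate $u_n \approx \log n + \gamma + 1/2n$, where $\gamma = 0.5772\ldots$ is Euler's constant; this estimate is an assumption imported for the illustration and is not established in the text, and the function names are likewise hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, assumed known

def harmonic_exact(n: int) -> float:
    """Sum 1 + 1/2 + ... + 1/n computed term by term."""
    return math.fsum(1.0 / k for k in range(1, n + 1))

def harmonic_estimate(n: int) -> float:
    """Classical approximation log(n) + gamma + 1/(2n)."""
    return math.log(n) + EULER_GAMMA + 1.0 / (2 * n)

# Sanity check of the estimate on a value of n that can still be summed directly.
print(harmonic_exact(10**5), harmonic_estimate(10**5))

# The "hyperastronomic" index: the estimate is about 230.8.
print(harmonic_estimate(10**100))
```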

Mathematicians such as Newton, Stirling or Euler who worked to obtain 
these numerical estimates had methods more . . . intellectual than that of 
churning for an undetermined time (the experts will estimate it for us) the 
“acres of computers” of the American National Security Agency , at the risk 
of compromising the said security for the idle amusement of mathematicians. 
They already have trouble in finding two prime numbers p and q when the 
product pq, all one tells them, has a hundred or so digits. 

To conclude these generalities, let us observe that the definition of a limit
supposes that the limit to be obtained is known. It is nevertheless possible
to decide on the convergence of a sequence without knowing the limit in
advance. In this direction one notes that if one has $d(u, u_n) < r/2$ for all
n > N, one will also have, by the triangle inequality,

$d(u_p, u_q) < r$ once p > N and q > N.

We shall show in the following chapter that this necessary condition for
convergence is also sufficient; this is Cauchy’s general criterion of convergence,
known before him to Bolzano, and which neither really proved. One can make
the result seem very plausible by choosing numbers r of the form $10^{-n}$; if
decimal numeration did not exhibit the bizarre behaviour to which we have
already alluded, the preceding inequality would show that starting from a
certain rank, the first n digits of the terms of the sequence would no longer
change, and this would demonstrate convergence.



Let us now give some examples of convergent sequences; the first are 
almost trivial, but those following will be very useful in the sequel. 

Example 1. The “constant” sequence u,u,u,... converges to u. 

Example 2. One has

(5.2) $\lim 1/n = 0$,

since the relation |1/n| < r can be written as nr > 1 so is satisfied for large n
by Archimedes’ axiom.

Example 3. One has

$\lim \frac{n}{n+1} = 1$

since $|1 - n/(n+1)| = 1/(n+1)$ tends to 0 by the preceding example.
Example 4. The sequence $u_n = (-1)^n + 1/n$ does not converge; its terms of
even order tend to 1, its terms of odd order to −1.

We note in this connection that if a sequence $(u_n)$ converges to a limit u,
then every subsequence that one can extract from it will converge, also to u.
Such a subsequence is obtained by choosing an increasing sequence of integers
$p_1, p_2$, etc. and omitting the terms of the initial sequence except for those
corresponding to this choice. Convergence then follows from the obvious fact
that $p_n \ge n$ for all n.

For example, the sequence with general term $1/n^3$ tends to 0, since it is
a subsequence of the sequence of example 2.

Example 5. If q is a complex number, then

(5.3) $\lim q^n = 0$ if |q| < 1.

We need to show that, for any r > 0, one has $|q^n| < r$ for n large. By replacing
q by |q|, one reduces to the case where $q \ge 0$, and even to q > 0, since the
case where q = 0 is trivial. As we are now assuming that q < 1, we have
1 = q + t with t > 0. All the terms in the binomial formula

$1 = (q + t)^{n+1} = q^{n+1} + (n+1) q^n t + \ldots$

are positive, whence $(n+1) q^n t < 1$, or

$0 < q^n < 1/t(n+1)$.

The right hand side clearly tends to 0, so $q^n$ does too.
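
The bound obtained from the binomial formula can be compared with the actual powers. This is a minimal sketch for real q with 0 < q < 1; the function name `power_and_bound` is an assumption for the illustration.

```python
def power_and_bound(q: float, n: int) -> tuple[float, float]:
    """Return (q**n, 1/(t*(n+1))) where t = 1 - q, assuming 0 < q < 1."""
    t = 1.0 - q
    return q ** n, 1.0 / (t * (n + 1))

for q in (0.5, 0.9, 0.99):
    for n in (1, 10, 100, 1000):
        power, bound = power_and_bound(q, n)
        print(q, n, power, bound)
```

The bound is very crude (it decreases only like 1/n whereas $q^n$ decreases geometrically), but it is enough to prove convergence to 0.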

For q = 1, the limit is clearly 1. For all other possible values of q the
sequence is divergent. For suppose that $\lim q^n = u$ exists for some number
$q \in \mathbb{C}$. The sequence with general term $q^{n+1}$ also converges to u, since it is
a subsequence. But $q^{n+1} = q \cdot q^n$, and it follows from the definition that, for
every convergent sequence,

$\lim u_n = u$ implies $\lim q \cdot u_n = q \cdot u$

for any $q \in \mathbb{C}$, since if $q \ne 0$, the only nontrivial case,

$|q u_n - q u| < r \iff |u_n - u| < r/|q|$,

a relation which is satisfied for large n since r/|q| > 0. Returning to the
sequence considered, we must have qu = u, i.e. either q = 1, the trivial case,
or u = 0, which cannot be the case for $|q| \ge 1$ since then $|q^n| \ge 1$ for all n.
We therefore have divergence except for the cases |q| < 1 and q = 1.



Example 6. Let us show that

(5.4) $\lim z^n/n! = \lim z^{[n]} = 0$

for any $z \in \mathbb{C}$. By taking absolute values we can restrict to the case where
$z \ge 0$. Writing $u_n$ for the general term, we first remark that, for n > p, we
have

$u_n = \frac{z^p}{p!} \cdot \frac{z^{n-p}}{(p+1) \cdots n} = u_p \cdot \frac{z}{p+1} \cdots \frac{z}{n}$.

Let us choose for p the integer part of 10z, so that $p \le 10z < p + 1$, and keep
p fixed. We have z/q < 1/10 for $q \ge p + 1$, and the relation above shows that

$u_n < u_p/10^{n-p} = 10^p u_p/10^n$ for all n > p,

whence $u_n < r$ as soon as n is so large that $10^n > 10^p u_p/r$, qed.
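
The comparison with a geometric sequence of ratio 1/10 can be checked in exact arithmetic. The sketch below uses rational z ≥ 0; the names `u` and `geometric_bound` are assumptions chosen to match the notation of the example.

```python
import math
from fractions import Fraction

def u(z: Fraction, n: int) -> Fraction:
    """General term z**n / n!."""
    return z ** n / math.factorial(n)

def geometric_bound(z: Fraction, n: int) -> Fraction:
    """Bound u_p / 10**(n - p), where p is the integer part of 10*z."""
    p = math.floor(10 * z)
    return u(z, p) / 10 ** (n - p)

z = Fraction(3)
p = math.floor(10 * z)            # p = 30 for z = 3
for n in (p + 1, p + 5, p + 10):
    print(n, float(u(z, n)), float(geometric_bound(z, n)))
```

Note that the terms $u_n$ first grow (up to n close to z) before collapsing; the bound only applies beyond n = p.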

Example 7. On writing $x^{1/n} = \sqrt[n]{x}$, we have

(5.5) $\lim x^{1/n} = 1$ for all x > 0.

Suppose first that x > 1, whence $x^{1/n} = 1 + x_n$ with $x_n > 0$. All the terms
in the binomial formula

$x = (1 + x_n)^n = 1 + n x_n + \ldots$,

are > 0, whence $0 < x_n < (x - 1)/n$ and so $\lim x_n = 0$, which proves (5).
The case where x = 1 is trivial. If 0 < x < 1, one puts x = 1/y with y > 1,
whence $x^{1/n} = 1/y^{1/n}$. It remains to show that, generally,

$\lim u_n = u \ne 0$ implies $\lim 1/u_n = 1/u$

which we shall do in a little while.

This example arose in the construction of the first tables of logarithms
by Napier and then Briggs, Kepler, etc. The fundamental relation log(xy) =
log x + log y shows that $\log(x^p) = p \log x$, whence

$\log(x^{1/n}) = \log(x)/n$.

Suppose that x > 1 and, as above, put

(5.6) $x^{1/n} = 1 + x_n$, whence $0 < x_n < (x - 1)/n < x/n$.

Then $\log x = n \log(1 + x_n)$, whence

$\log(x)/(n x_n) = \log(1 + x_n)/x_n$.

When n increases indefinitely, the right hand side is of the form log(1 + h)/h
where h tends to 0. If one assumes that the function log, which clearly satisfies
log 1 = 0, is very “regular”, it is simplest, so as not to make the calculations
too intricate, to assume that

$\log(1 + h) \sim h$ as $h \to 0$

(or ∼ Mh where M is a constant if one works, like Briggs, with logarithms to
base 10), in other words, that the function log has a derivative¹³ equal to 1
at x = 1.

In these circumstances, $\log(1 + x_n)/x_n$ tends to 1 by definition of the
derivative, thus $\log(x)/(n x_n)$ also, and, since log x does not depend on n, one
sees finally that $\log x = \lim n x_n$. Taking account of the definition (6) of $x_n$,
one thus obtains the fundamental formula

(5.7) $\log x = \lim n(x^{1/n} - 1)$.

This seems to be due to the astronomer Halley (1695), although the essential
ideas are already in Napier and Briggs.

Exercise. Show that (7) is equivalent to

$x^{1/n} = 1 + \frac{\log x}{n} + o\left(\frac{1}{n}\right)$ as $n \to +\infty$.


To avoid all confusion, we stress the fact that, at this stage of the expo-
sition, the formula (7) is not, in itself, either a definition or a construction
of the function log. We have only shown that, if there exists a function log
satisfying log(xy) = log x + log y and such that $\log(1 + x) \sim x$ when x tends
to 0, then it is given by the relation (7). But we have as yet proved neither
the existence of the function log, nor the convergence of the sequence (7).
This will be the object of Theorem 3 of n° 10.

In fact, as we shall see later, log(1 + u) lies, for 0 < u < 1, between
$u - u^2/2$ and u. Since

$\log x = n \log(1 + x_n)$,

we have

(5.8) $0 < n x_n - \log x < n x_n^2/2 < x^2/2n$,

the last inequality following from (6). In other words, the error committed in
replacing log x by $n x_n$ is $< x^2/2n$ provided that $x_n < 1$. It only remains, a
modest enterprise, to perform the numerical calculations.
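
Formula (7) and the error estimate (8) can be tried out with $n = 2^p$, so that only square roots are needed. The following is a sketch in double precision; the function name `halley_estimate` is an assumption, `math.log` is used as the reference value, and p is kept small because the subtraction `root - 1.0` loses accuracy when p is large.

```python
import math

def halley_estimate(x: float, p: int) -> tuple[int, float]:
    """Return (n, n * x_n) for n = 2**p, where 1 + x_n is the n-th root of x."""
    root = x
    for _ in range(p):
        root = math.sqrt(root)      # after p steps, root = x**(1/2**p)
    n = 2 ** p
    return n, n * (root - 1.0)

x = 2.0
for p in (5, 10, 15, 20):
    n, estimate = halley_estimate(x, p)
    error = estimate - math.log(x)
    # (5.8) predicts 0 < error < x**2 / (2n).
    print(p, estimate, error, x * x / (2 * n))
```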

13 This is the crucial point in obtaining formula (7). Napier, who worked at his 
tables from about 1590 to his death in 1617, and Briggs, who transformed them 
into logarithms to base 10 from about 1615 on, did not argue in terms of “deriva- 
tives” for the excellent reason that these would not appear, and then in a rather 
hazy way, until about twenty years later, with Fermat and Descartes, a propos 
the calculus of tangents to a curve. But Napier imagined a point moving along 
a segment of line with a speed inversely proportional to its distance x from the 
origin of the segment, and the concept of “instantaneous speed” was just that of 
derivative with respect to time. The reader should not believe that the concepts 
and ideas which, nowadays, appear simple enough to be taught every year to 
hundreds of thousands of young people of the Earth, were born fully armed from 
certain brains of genius, like Athena from that of Jupiter . . . 
Since there is no practical method of extracting n th roots numerically (by
hand . . .) apart from the case where n is a power of 2 - then it suffices to
extract successive square roots - one restricts to integers n of the form $2^p$.
In this case, the error given by (8) is majorised by $x^2/2^{p+1}$.

Napier, Briggs after his death, and then Kepler, set themselves, among 
other things, to calculate the logarithms of the first thousand integers exact 
to 7 or 14 decimal places; starting from this, one could find those of a large 
number of fractional values of x, since log(p/q) = log p − log q. (Above all, they
needed the logs of the trigonometric functions, but this is another problem 
again). Of course, they did not have to calculate all these logs directly; it is 
enough, for a start, to calculate those of the prime numbers 2, 3, 5, 7, 11, 13, 
etc. and even though in this case there are “tricks” to reduce the labour, it 
remains considerable, to express it mildly. 

The first candidate is x = 2. One has to extract p successive square roots,
choosing p and performing the calculations to sufficiently many places to have
a hope of obtaining the required precision. In this particular case the error
is less than $2^2/2^{p+1} = 2^{-p+1}$ in addition to those committed in extracting
the p successive square roots of 2. To obtain the result to 14 places, it is
thus prudent to choose p so that $2^{-p+1} < 10^{-15}$, i.e. $2^p > 2 \cdot 10^{15}$. Now
$2^9 = 512 < 10^3 < 2^{10} = 1024$, whence $2^{50} > 10^{15} > 2^{45}$, which indicates the
need to choose for p a number between 45 and 50, in other words to extract
at least 45 successive square roots of 2 to ensure having 15 places exact at
the end. One can reduce the work with the aid of the following remark.

We know, and they already knew then, that

$1 + u/2 - u^2/8 < (1 + u)^{1/2} < 1 + u/2$ for 0 < u < 1

(square it all), so that, for u small, the error committed in replacing the
square root of 1 + u by 1 + u/2 is less than $u^2/8 = u^2/2^3$; the error is thus
$< 2^{-50}$ if $u < 2^{-24}$. Now, in calculating the $x_n$ (where $1 + x_n$ here denotes
the result of n successive square roots of 2), one has, by (6),

$1 + x_{n+1} = (1 + x_n)^{1/2}$ and $x_n < 2^{-n+1} < 2^{-24}$

once n > 25. One thus sees that after having extracted the 24 or 25 first
successive square roots of 2 to 15 exact places, one may assume that
$(1 + u)^{1/2} = 1 + u/2$ for p > 25. In other words, one has to calculate only the first
25 square roots to 15 places, which is still not within reach of everyone.

In the case of logs to base 10, Briggs started by extracting 54 successive
square roots of 10, which gave him the number¹⁴

1.00000 00000 00000 12781 91493 20032 35 = 1 + h

and allowed him to calculate the number M such that $\log_{10}(1 + h) \sim Mh$
since

$1 = \log_{10}(10) = 2^{54} \log_{10}(1 + h) \sim 2^{54} M h$.

To calculate $\log_{10} 2$ for example, he again extracted 54 square roots of 2, and
found

1.00000 00000 00000 03847 73979 65583 10 = 1 + h'

and since

$\log_{10} 2 = 2^{54} \log_{10}(1 + h') \sim 2^{54} M h'$,

finally he found

$\log_{10} 2 \sim h'/h$ = 0.30102 99956 63881 2.

14 See E. Hairer and G. Wanner, Analysis by Its History (Springer, New York, 1996),
p. 30, which also reproduces in facsimile the page where Briggs tabulates the 54
successive square roots of 10 and their logarithms.

This done, you start again with 3, 5, 7, etc. One may well suspect that, as 
we shall see, more economical procedures were invented later. 
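
Briggs's procedure can be replayed with Python's `decimal` module. This is only a sketch of the computation described above, not a reconstruction of his actual working: the 50-digit working precision and the names `iterated_sqrt`, `h`, `h_prime` and `M` are assumptions made for the illustration.

```python
from decimal import Decimal, getcontext

getcontext().prec = 50   # assumed working precision, comfortably beyond 32 places

def iterated_sqrt(x: int, times: int) -> Decimal:
    """Extract `times` successive square roots of x."""
    y = Decimal(x)
    for _ in range(times):
        y = y.sqrt()
    return y

h = iterated_sqrt(10, 54) - 1        # 10**(1/2**54) = 1 + h
h_prime = iterated_sqrt(2, 54) - 1   # 2**(1/2**54) = 1 + h'
M = 1 / (2 ** 54 * h)                # from 1 ~ 2**54 * M * h

print(1 + h)
print(1 + h_prime)
print("M        ~", M)
print("log10(2) ~", h_prime / h)
```

The two printed roots can be compared digit by digit with the numbers quoted from Briggs, and the ratio h'/h with the known value of $\log_{10} 2$.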

We shall return to all this a propos the logarithmic functions. 

Example 8. We have

(5.9) $\lim n^{1/n} = 1$.

Let us put $n^{1/n} = 1 + x_n$ where clearly $x_n \ge 0$. The binomial formula

$n = (1 + x_n)^n = 1 + n x_n + \frac{1}{2} n(n-1) x_n^2 + \ldots$,

in which all the terms are $\ge 0$, shows that $x_n^2 < 2/(n-1)$, an expression which
tends to 0. For all r > 0, one thus has $x_n^2 < r^2$ for n large, so $0 \le x_n < r$,
and thus $x_n$ tends to 0, qed. (A sequence which tends to 0 is $< r^2$ or $r^{624}$ for
n large since $r^2 > 0$).
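
The inequality $x_n^2 < 2/(n-1)$ used in example 8 is easy to observe numerically. A minimal sketch in floating point; the name `root_excess` is an assumption for the illustration.

```python
def root_excess(n: int) -> float:
    """Return x_n, defined by n**(1/n) = 1 + x_n."""
    return n ** (1.0 / n) - 1.0

for n in (2, 10, 100, 10_000, 1_000_000):
    x_n = root_excess(n)
    print(n, x_n, x_n ** 2, 2.0 / (n - 1))
```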

6 — The language of series 

Though not basically different from that of sequences, the language of series is 
frequently convenient, principally because many individual functions can be 
represented more naturally by series than by sequences; their theory scarcely 
appeared before Cauchy’s time, the 1820s, while they are ubiquitous after 
1650. 

The fundamental problem is to give a meaning to a sum of an infinite
number of terms, imposing, of course, reasonable conditions; we do not pro-
pose to explain what the sum 1 − 2 + 3 − 4 + 5 − ... might signify, let alone
the sum of all the real numbers¹⁵, for it is prudent to restrict oneself to sums
having no more than a countable infinity of terms. The terms of such a sum
may, by hypothesis, be put in the form of a sequence $u_1, u_2, \ldots$ Most often,
they are given in advance in this form, but there are also cases where it would
be quite artificial to order the terms of the sum as a sequence. Consider for
example the “lattice” $\mathbb{Z}^2$ of points with integer coordinates in the Cartesian
plane $\mathbb{R}^2$, and suppose that you are interested in the sum of inverses of the
k th powers of the distances from the origin to the points of the lattice, i.e.
you wish to assign a meaning to the sum of numbers $1/(m^2 + n^2)^{k/2}$, where
m and n are rational integers, not both zero. We know that $\mathbb{Z}^2$ is countable,
but there is no privileged, natural or obvious bijection of $\mathbb{N}$ onto $\mathbb{Z}^2$. This
poses, in this case, the problem of defining the “unordered” sum of the terms
of the series. We shall study this later (n° 11), but for the moment we confine
ourselves, maybe wrongly, to the traditional situation of a sum whose terms
are given in the form of an ordered sequence.

15 An ingenious innocent might observe that, each real number being cancelled by
its opposite, the sum in question is “obviously” 0 . . .

To obtain a simple series, or series for short, without explicit mention to
the contrary, one starts with a sequence $(u_n)$ of complex numbers. Since the
total sum of the $u_n$ can reasonably be defined only through an approximation
procedure involving only the sums of a finite number of terms - the only ones
that we know at this stage of the exposition -, it is natural to consider the
numbers

$s_1 = u_1$

$s_2 = u_1 + u_2$

$s_3 = u_1 + u_2 + u_3$

etc. One calls them the ordered partial sums, or partial sums for short when
there is no fear of confusion, of the series with general term $u_n$, and one says
that this is convergent when $\lim s_n = s$, the sum of the series considered,
exists. Then one writes

$s = \sum_{n=1}^{\infty} u_n$ or $s = u_1 + u_2 + \ldots$,

or simply $s = \sum u_n$. So, by definition,

(6.1) $s = \lim (u_1 + \ldots + u_n)$.

The notation $u_1 + u_2 + \ldots$, as used by all the Founding Fathers, is now in
total desuetude, but the reader will maybe find it, at the beginning, easier
than the other.

As the convergence of a series reduces to that of a sequence, conversely
the relation

$u_n = u_1 + (u_2 - u_1) + \ldots + (u_n - u_{n-1})$

reduces the convergence of a sequence to that of a series.

Some people seem to believe that it is contrary to the principles of sane 
pedagogy to introduce series at the beginning of teaching analysis. At the risk 
of traumatising the reader, let us observe that the nonterminating decimal 
expansion 



$x = x_0.x_1 x_2 \ldots$







of a real number means that it is the limit of the sequence whose terms are

$s_0 = x_0$

$s_1 = x_0 + x_1/10$

$s_2 = x_0 + x_1/10 + x_2/100$

etc., in other words that

(6.2) $x = x_0 + x_1/10 + x_2/100 + \ldots = \sum_{n=0}^{\infty} x_n/10^n$.



Consider for example the number

2/15 = 0.13333333333333...,

according to commercial arithmetic. By the above, the right hand side is the
sum of the series 1/10 + 3/100 + 3/1000 + ..., plausibly equal to

(6.3) $1/10 + 3 \cdot 10^{-2} (1 + 1/10 + 1/100 + \ldots)$,

so that it comes down to calculating the sum of the geometric series

$1 + q + q^2 + \ldots$



for q = 1/10. Now the partial sums are, for $q \ne 1$, the numbers

$1 + q + q^2 + \ldots + q^n = \frac{1 - q^{n+1}}{1 - q} = \frac{1}{1 - q} - \frac{q^{n+1}}{1 - q}$.

When n increases indefinitely, $q^{n+1}$ tends to 0 if |q| < 1 (n° 5, example 5),
so also does $q^{n+1}/(1 - q)$ by the more elementary rules that one will find in
n° 8. One deduces that

(6.4) $1 + q + q^2 + \ldots = \sum_{n \in \mathbb{N}} q^n = \frac{1}{1 - q}$ if |q| < 1.



In particular,

$1 + 1/10 + 1/100 + \ldots = \frac{1}{1 - 1/10} = 10/9$.

Thus one finds the value $1/10 + 3 \cdot 10^{-2} \cdot 10/9$ for the series (3) and it remains
to verify that this result is in fact the fraction 2/15 from which we started.
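
The remaining verification, and the closed form of the partial sums, can be done in exact rational arithmetic. A minimal sketch; the function name `geometric_partial_sum` is an assumption for the illustration.

```python
from fractions import Fraction

def geometric_partial_sum(q: Fraction, n: int) -> Fraction:
    """Exact value of 1 + q + q**2 + ... + q**n."""
    return sum((q ** k for k in range(n + 1)), Fraction(0))

q = Fraction(1, 10)

# Partial sums against the closed form (1 - q**(n+1)) / (1 - q).
for n in (1, 3, 6):
    print(n, geometric_partial_sum(q, n), (1 - q ** (n + 1)) / (1 - q))

# Value found for the series (6.3), using the sum 1/(1 - q) = 10/9.
value = Fraction(1, 10) + 3 * Fraction(1, 100) * (1 / (1 - q))
print(value)   # the fraction 2/15
```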

On replacing q by −q in (4), one finds the formula

(6.5) $1 - q + q^2 - q^3 + \ldots = \sum_{n=0}^{\infty} (-1)^n q^n = \frac{1}{1 + q}$,

already known to Viète, Newton and Mercator, the last two having used it
around 1665 to calculate the area of a segment of a hyperbola. The math-
ematicians of the XVIIth century, Newton in the first place, obtained these
series by a very different procedure, the division of 1 by 1 + q according to the
increasing powers of q; one proceeds as one would in commercial arithmetic
if q were equal to 1/10, obtaining the quotient

$1 - q + q^2 - q^3$

etc., with successive “remainders” equal to $-q, q^2, -q^3, \ldots$ A more econom-
ical procedure consists of writing

$(1 + q)(1 - q + q^2 - q^3 + \ldots) =$

$= (1 - q + q^2 - q^3 + \ldots) + (q - q^2 + q^3 - \ldots) = 1$.

One should pay attention to the fact that these formulae assume |q| < 1 since
otherwise the term $q^{n+1}$ appearing in the partial sum does not tend to any
limit (n° 5, example 5), so that the geometric series is divergent. Otherwise,
one could also suppose that q = 1 in (5) and thus obtain the relation

1 − 1 + 1 − 1 + 1 − 1 + ... = 1/2

which, fascinating though it is - Jakob Bernoulli “discovered” it in 1696 and
others got trapped by it before or after this date -, has no meaning: the
partial sums of the series of the left hand side being alternately 1, 0, 1, 0, ...,
one cannot see how they could converge! Absurdity would reach even more
extravagant heights if one put q = 2 in (4); one would thus “discover” that
1 + 2 + 4 + 8 + 16 + 32 + ... = −1, an example which Nikolaus I Bernoulli
produced in 1743 in a letter to Euler to warn him away from divergent series¹⁶.

Series lead to much stranger formulae, such as

$1 + 1/2^2 + 1/3^2 + 1/4^2 + \ldots = \pi^2/6$,

$1 + 1/2^4 + 1/3^4 + 1/4^4 + \ldots = \pi^4/90$,

$1 + 1/2^6 + 1/3^6 + 1/4^6 + \ldots = \pi^6/945$,

$\cot x = \frac{1}{x} + 2x \sum_{n=1}^{\infty} \frac{1}{x^2 - n^2 \pi^2}$,

a formula valid for all x not a multiple of $\pi$; they are all due to Euler (1707-
1783), as is the expansion

$\sin x = x \prod_{n=1}^{\infty} (1 - x^2/n^2 \pi^2)$,

valid for all $x \in \mathbb{R}$, of the function sine as an infinite product¹⁷.

16 Moritz Cantor, Vorlesungen über Geschichte der Mathematik (Teubner, Vol. III,
1901), p. 691. Euler was not convinced; he believed that every series, even di-
vergent, had a hidden meaning, and, in fact, was the first to calculate with the
“formal series” of which we will speak in n° 22. But these are not series of
numbers.

We said above that many of the elementary functions (and infinitely many
others) can be represented conveniently by series, and particularly by power
series of the form

$a_0 + a_1 x + a_2 x^2 + \ldots = \sum a_n x^n$

where the $a_n$ are numerical coefficients; it was Newton who, first, made sys-
tematic use of them to resolve all sorts of problems; he explains that they
play a role in analysis analogous to the decimal notation in arithmetic, as the
relation (2) above confirms, and that the two techniques can even be used in
the same way, which is a little optimistic. Not yet being able to justify them
at this stage of the exposition, we shall give examples of remarkable power
series which, in this chapter, we shall use frequently as experimental material
to illustrate the interest of theorems of which, otherwise, the reader might
not see the necessity:

$\sin x = x - x^3/3! + x^5/5! - x^7/7! + \ldots$ for any x,

$\cos x = 1 - x^2/2! + x^4/4! - x^6/6! + \ldots$ for any x,

$e^x = 1 + x/1! + x^2/2! + x^3/3! + \ldots$ for any x,

$\log(1 + x) = x - x^2/2 + x^3/3 - x^4/4 + \ldots$ for $-1 < x \le 1$,

$(1 + x)^s = 1 + sx + s(s-1)x^2/2! + s(s-1)(s-2)x^3/3! + \ldots$ for |x| < 1

for all real exponents s, for example s = 1/2, the first case treated by Newton;
this is the famous “binomial formula of Newton” which, for $s \in \mathbb{N}$, reduces
to the known algebraic formula since the coefficient of $x^n$ is then clearly
zero for all n > s; see Chap. IV, n° 11. These formulae were discovered by
Newton (1642-1727) when in 1665-67 the “Great Plague” which ravaged the
region of London, see Samuel Pepys and Daniel Defoe, forced him to return
to the countryside of his adolescence where, among other occupations, he
discovered the composition of white light and the first idea of the law of
universal gravitation, work in the fields not attracting him particularly.
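
Pending their justification, these expansions can at least be tested experimentally. The sketch below sums finitely many terms of the logarithmic and binomial series and compares them with the library functions; the function names are assumptions, and `math.log` and `math.sqrt` serve only as reference values.

```python
import math

def log1p_series(x: float, terms: int) -> float:
    """Partial sum x - x**2/2 + x**3/3 - ... with `terms` terms."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, terms + 1))

def binomial_series(x: float, s: float, terms: int) -> float:
    """Partial sum 1 + s*x + s(s-1)*x**2/2! + ... with `terms` terms."""
    total, coeff = 0.0, 1.0          # coeff = s(s-1)...(s-k+1)/k!
    for k in range(terms):
        total += coeff * x ** k
        coeff *= (s - k) / (k + 1)
    return total

print(log1p_series(0.5, 60), math.log(1.5))
print(binomial_series(0.2, 0.5, 40), math.sqrt(1.2))
print(binomial_series(0.3, 3, 10), 1.3 ** 3)   # s in N: the series terminates
```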

17 Given a sequence of numbers $(u_n)$ all $\ne 0$, one says that the infinite product of
the $u_n$ converges if the partial products $p_n = u_1 \cdots u_n$ tend to a nonzero limit; it
is necessary for this that $\lim u_n = 1$. The theory reduces easily to that of series
thanks to the function log, which transforms a product into a sum (Chap. IV,
n° 17). We shall show in n° 21 how Euler discovered his formula.
In practice, all the functions that one meets in classical analysis are repre-
sentable as power series or, if necessary, by series with a further finite number
of terms of negative degree like

$1/x^2(1 - x) = x^{-2} + x^{-1} + 1 + x + x^2 + \ldots$,

or by series of this type where the variable is a fractional power of x, as in
the relation

$(x - x^2)^{1/2} = x^{1/2} - x^{3/2}/2 - x^{5/2}/8 - x^{7/2}/16 - 5x^{9/2}/128 - \ldots$

which Newton deduced from the binomial series for s = 1/2. This fact led
the mathematicians to study systematically from the XIXth century - there
was an attempt by Lagrange a little earlier which came to nothing because
he restricted himself to functions of a real variable - the analytic functions
of a complex variable, which one can define as follows. Consider a function f
with complex values defined on an open subset G of $\mathbb{C}$, i.e. such that, for all
$a \in G$, the set G contains an open disc d(a, z) < r with centre a and radius
r > 0 (depending on a). The function f is called analytic if, for all $a \in G$,
there exists a power series

$c_0(a) + c_1(a)(z - a) + c_2(a)(z - a)^2 + \ldots = \sum c_n(a)(z - a)^n$



whose coefficients depend on a and which (i) converges for |z − a| sufficiently
small, (ii) has sum f(z) on a neighbourhood of a, whence necessarily $c_0(a) =
f(a)$. Take for example the function f(z) = 1/z, defined on the open set
$z \ne 0$. For $a \ne 0$, one can write

$\frac{1}{z} = \frac{1}{a - (a - z)} = \frac{1}{a} \cdot \frac{1}{1 - (a - z)/a} = \sum_{n \in \mathbb{N}} (a - z)^n/a^{n+1}$

on condition that |(a − z)/a| < 1, i.e. |z − a| < |a|; the function 1/z is thus
represented by the power series in z − a that we have just written, in the
largest disc with centre a not containing - which is normal - the point z = 0.
Here $c_n(a) = (-1)^n/a^{n+1}$.
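
The expansion of 1/z about a point a can be tried out with complex numbers. A minimal sketch; the function name `inverse_series` and the sample values of a and z are assumptions chosen so that |z − a| < |a|.

```python
def inverse_series(z: complex, a: complex, terms: int) -> complex:
    """Partial sum of sum_n (a - z)**n / a**(n+1), which represents 1/z for |z - a| < |a|."""
    return sum((a - z) ** n / a ** (n + 1) for n in range(terms))

a = 1 + 1j
z = 1.2 + 0.7j                      # |z - a| is about 0.36, well below |a|
for terms in (5, 20, 60):
    approx = inverse_series(z, a, terms)
    print(terms, approx, abs(approx - 1 / z))
```

Choosing z with |z − a| > |a| instead makes the partial sums blow up, in agreement with the divergence of the geometric series outside the disc.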

Cauchy (1789-1857) was the first to observe, having devoted thirty
years of work to them before seeing it clearly, that these functions possess
extraordinary properties which nearly all the mathematicians of the XIXth
century, and a good subset of their successors, have put to work at one
time or another, and have in particular generalised to functions of several
complex variables.



7 — The marvels of the harmonic series 

Let us return to the elementary theory of convergent sequences. Remember
that a sequence $(v_n)$ is a subsequence of a sequence $(u_n)$ if there exists a
strictly increasing sequence of integers $p_1, p_2, \ldots$ such that¹⁸

$v(n) = u(p_n)$ for all n.

For example, the sequence $(1/n^2)$ is a subsequence of the sequence (1/n).

If a sequence $(u_n)$ converges to a limit u, then every subsequence of $(u_n)$
converges to u. With the notation above, one has $p_n \ge n$ for all n since
$p_n > \ldots > p_1 \ge 1$; the relation

$d(u, u_n) < r$ for all n > N

then implies that $d(u, v_n) < r$ for all n > N, qed.

This trivial result translates usefully into the language of series. To extract
a subsequence from the sequence of partial sums $s_n$ of a series $\sum u(n)$ one
chooses as above a sequence of integers $p_n$ and considers the sequence whose
terms are

$s(p_1) = u(1) + \ldots + u(p_1)$, $s(p_2) = u(1) + \ldots + u(p_2)$,

etc. These are manifestly the partial sums of the series whose successive terms
are

$v_1 = u(1) + \ldots + u(p_1)$, $v_2 = u(p_1 + 1) + \ldots + u(p_2)$,

$v_3 = u(p_2 + 1) + \ldots + u(p_3)$, ...

in other words, of the series obtained by grouping the terms of the initial
series into blocks of $p_1, p_2 - p_1, p_3 - p_2, \ldots$ terms as if dealing with a finite
sum. Theorem 1 shows that, if the initial series converges, so does the new,
the two series having the same sum. This is an extension of the associativity
of addition: one has

$u(1) + u(2) + u(3) + \ldots = [u(1) + \ldots + u(p_1)] + [u(p_1 + 1) + \ldots + u(p_2)] + \ldots$

as for finite sums so long as the left hand side converges. One should be
aware of the fact that though a series may become convergent after grouping
its terms, it does not follow that it was already so before this operation: the
series (1 − 1) + (1 − 1) + ... has no merit in being convergent, and the series
1 − 1 + 1 − 1 + ... is divergent. This difficulty does not arise with series of
positive terms as we shall see in n° 12.

The preceding artifice can serve to prove the divergence of a series, for 
example of the harmonic series 

1 + 1/2 4-1/3 + ... . 

If it were indeed convergent, so would be the series 



18 The functional expression u(n) shows that a subsequence is only a particular case 
of the general concept of composition of maps: compose n h - > p n and n »— > u n . 
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1/2 + (1/3 + 1/4) + (1/5 + 1/6 + 1/7 + 1/8) + . . . 

obtained by grouping of 1, 2, 4, 8, 16, . . . terms in the initial series (one omits 
the first term which clearly plays no role in the question). Now the first group 
of terms has the value 1/2; the second, being the sum of two terms at least 
1/4, is at least 1/2; the third likewise, since it is the sum of four terms at least 
1/8; etc. One thus finds, for the new series, partial sums successively 
at least 1/2, 1, 3/2, etc.; whence the divergence of the new series and 
so of the harmonic series. This kind of argument will be generalised in n° 12 
(“Cauchy’s condensation criterion”). 
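
The block estimate can be verified with exact rational arithmetic. The sketch below is an added illustration in Python; the function name harmonic_block and the indexing of the groups from k = 0 are assumptions made here, not notation from the text. The k-th group runs from 1/(2^k + 1) to 1/2^(k+1) and contains 2^k terms.

```python
from fractions import Fraction


def harmonic_block(k):
    """Exact sum 1/(2**k + 1) + ... + 1/2**(k + 1) of the k-th group (k = 0, 1, ...)."""
    start, stop = 2**k + 1, 2 ** (k + 1)
    return sum(Fraction(1, n) for n in range(start, stop + 1))


blocks = [harmonic_block(k) for k in range(6)]
print(blocks[0], blocks[1])  # 1/2 7/12

# Partial sums of the grouped series: after m groups the sum is at least m/2.
running = Fraction(0)
for m, block in enumerate(blocks, start=1):
    running += block
    assert block >= Fraction(1, 2)
    assert running >= Fraction(m, 2)
```

Six groups already cover the terms 1/2, ..., 1/64; the assertions confirm the lower bound m/2 on this range only, the general statement being the one proved in the text.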

There are many variants of the preceding proof; they have sometimes given 
rise, historically, to spectacular errors 19 of a nature to commend prudence to 
readers who are starting on the subject. A little after 1650, the Italian Pietro 
Mengoli observed that one always has 

$$\frac{1}{n-1} + \frac{1}{n} + \frac{1}{n+1} > \frac{3}{n};$$

so, by grouping the terms of the harmonic series, one obtains 



1 + (1/2 + 1/3 + 1/4) + (1/5 + 1/6 + 1/7) + . . . > 1 + 3/3 + 3/6 + . . . 

= 1 + 1 + (1/2 + 1/3 + 1/4) + (1/5 + 1/6 + 1/7) + (1/8 + . . . 

> 1 + 1 + 3/3 + 3/6 + . . . = 1 + 1 + 1 + (1/2 + 1/3 + 1/4) + . . . 

etc., which shows that the proposed sum s of the series is greater than any 
integer. Instead of advertising his ingenuousness, Mengoli should have con- 
fined himself to observing that the first of his inequalities already provides 
the rather fishy relation s > s + 1 . 
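
Mengoli's inequality is quoted above without proof. One possible justification, offered here as a sketch and valid for every integer n ≥ 2, is the following.

```latex
\[
\frac{1}{n-1} + \frac{1}{n+1} \;=\; \frac{2n}{n^2 - 1} \;>\; \frac{2n}{n^2} \;=\; \frac{2}{n}
\qquad (n \ge 2),
\]
\[
\text{hence}\qquad
\frac{1}{n-1} + \frac{1}{n} + \frac{1}{n+1} \;>\; \frac{2}{n} + \frac{1}{n} \;=\; \frac{3}{n}.
\]
```

Taking n = 3, 6, 9, ... gives the groups 1/2 + 1/3 + 1/4 > 3/3, 1/5 + 1/6 + 1/7 > 3/6, and so on, which are those used in the grouping above.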

Forty years later, Johann Bernoulli used an analogous idea. He started 
from the relation 

1/1.2 + 1/2.3 + 1/3.4 + ... = 1, 
obvious if one writes 20 it as 

(1 — 1/2) + (1/2 — 1/3) + . . . = 1, 
and then remarked that 

1/2 + 1/3 + 1/4 + . . . = 1/1.2 + 2/2.3 + 3/3.4 + ... 

= (1/1.2 + 1/2.3 + 1/3.4 + ...) + (1/2.3 + 1/3.4 + ...) + (1/3.4 + . . .) 
= 1 + (1 - 1/2) + (1 - 1/2 - 1/6) + (1 - 1/2 - 1/6 - 1/12) + . . . 

= 1 + 1/2 + 1/3 + 1/4 + ..., 

19 I find them in Cantor, Vorlesungen . . ., Vol. III, particularly pp. 94-96. 

20 The relation in question is obvious if one calculates as with a finite sum since 
all the terms apart from the first “visibly” cancel in pairs. But the correct proof 
consists of remarking that the sum of the first $n$ terms, namely $1 - 1/(n+1)$, tends 
to 1 since $1/(n+1)$ tends to 0. 
whence the relation s = 1 + s for the “sum” s of the series. Immediately 
Jakob Bernoulli, his elder brother, observed that the partial sum 

$1/(a + 1) + \ldots + 1/a^2$ 

has $a^2 - a$ terms, all but the last greater than $1/a^2$, so has a value greater than $1 - 1/a$, 
whence it follows that $1/a + \ldots + 1/a^2 > 1$. From here it is easy to extract 
from the harmonic series groupings of terms whose sums exceed an arbitrary 
integer. Jakob Bernoulli observed at that point that a series whose terms tend 
to 0 - a condition clearly necessary for convergence since $u_n = s_n - s_{n-1}$ is 
the difference of two sequences which tend to the same limit - can still be 
divergent. 
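
Jakob Bernoulli's estimate lends itself to a direct check. The Python sketch below is an added illustration; the names bernoulli_block and disjoint_blocks are hypothetical. It computes 1/a + ... + 1/a^2 exactly, then lists a few consecutive, non-overlapping blocks of the harmonic series, each exceeding 1, by starting every new block just after the previous one ends.

```python
from fractions import Fraction


def bernoulli_block(a):
    """Exact value of 1/a + 1/(a + 1) + ... + 1/a**2 for an integer a >= 2."""
    return sum(Fraction(1, k) for k in range(a, a * a + 1))


def disjoint_blocks(count, start=2):
    """Return `count` triples (first index, last index, block sum), the blocks
    being consecutive and non-overlapping: a .. a**2, then a**2 + 1 .. (a**2 + 1)**2, etc."""
    result = []
    a = start
    for _ in range(count):
        result.append((a, a * a, bernoulli_block(a)))
        a = a * a + 1
    return result


print(bernoulli_block(2))  # 13/12, i.e. 1/2 + 1/3 + 1/4

for first, last, value in disjoint_blocks(3):
    print(first, last, float(value))  # each block sum exceeds 1
```

The count is kept small because the blocks grow very quickly: the fourth one would already run from 677 to 458329.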

The fact that he had revealed the absurd marvels of the harmonic series 
did not prevent Jakob, several years later, from baldly putting 

A = 1/1 + 1/2 + 1/3 + ..., 



and then deducing that 

A - 1 - 1/2 = 1/3 + 1/4 + ... 
and then, by subtraction, that 

3/2 = 2/1.3 + 2/2.4 + 2/3.5 + . . . , 

which provided him a “new proof”, this time correct, of a formula obtained 
by Leibniz in 1682: 



1/1.3 + 1/2.4 + 1/3.5 + . . . = 3/4. 

Similarly, putting E = 1/1 + 1/3 + 1/5 + ... (the series diverges), whence of 
course E - 1 = 1/3 + 1/5 + . . ., he obtains by difference and division by 2 
another correct formula of Leibniz's: 

1/1.3 + 1/3.5 + 1/5.7 + . . . = 1/2. 

These calculations are meaningless. The usual rules of algebra were de- 
veloped for calculating finite sums, i.e. consisting of only a finite number of 
terms; it is sometimes legitimate to apply them to convergent series, and al- 
most always, as we shall see later, to the absolutely convergent series which 
we shall introduce in n° 15, not because they are obvious, but because the 
mathematicians of the XIXth century have proved the indispensable general 
theorems. 
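
Both formulae of Leibniz can be reached legitimately through partial sums, in the manner of footnote 20. The Python sketch below is an added illustration: the function names are hypothetical, and the closed forms are not taken from the text but from the decompositions 1/n(n+2) = (1/2)(1/n - 1/(n+2)) and 1/(2n-1)(2n+1) = (1/2)(1/(2n-1) - 1/(2n+1)), which make the finite sums telescope.

```python
from fractions import Fraction


def leibniz_sum_gap_two(N):
    """Exact partial sum 1/(1*3) + 1/(2*4) + ... + 1/(N*(N + 2))."""
    return sum(Fraction(1, n * (n + 2)) for n in range(1, N + 1))


def leibniz_sum_odd(N):
    """Exact partial sum 1/(1*3) + 1/(3*5) + ... + 1/((2N - 1)*(2N + 1))."""
    return sum(Fraction(1, (2 * n - 1) * (2 * n + 1)) for n in range(1, N + 1))


def closed_form_gap_two(N):
    """Telescoped value: 3/4 - (1/2) * (1/(N + 1) + 1/(N + 2))."""
    return Fraction(3, 4) - Fraction(1, 2) * (Fraction(1, N + 1) + Fraction(1, N + 2))


def closed_form_odd(N):
    """Telescoped value: (1/2) * (1 - 1/(2N + 1)) = N/(2N + 1)."""
    return Fraction(N, 2 * N + 1)


for N in (1, 10, 100):
    assert leibniz_sum_gap_two(N) == closed_form_gap_two(N)
    assert leibniz_sum_odd(N) == closed_form_odd(N)
    # Distances to the limits 3/4 and 1/2; both shrink as N grows.
    print(N, float(Fraction(3, 4) - leibniz_sum_gap_two(N)),
          float(Fraction(1, 2) - leibniz_sum_odd(N)))
```

Here only finite sums are manipulated, so the usual rules of algebra apply without reservation; the limits 3/4 and 1/2 then follow because the correction terms tend to 0.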

The same Jakob Bernoulli would give another precarious example in 1692. 
He starts from the relations 

1/1 + 1/2 + 1/4 + 1/8 + ... = 2/1, 

1/3 + 1/6 + 1/12 + 1/24 +... = 2/3 

1/5 + 1/10 + 1/20 + 1/40 +... = 2/5 
obtained by dividing the relation 

$1 + 1/2 + 1/2^2 + 1/2^3 + \ldots = 2$, 

itself obtained on putting q = 1/2 in the sum (6.4) of the geometric series, 
by 1, 3, 5, . . . Having done this, Bernoulli adds these relations side-by-side as 
one might do with a finite number of finite sums. One obtains the harmonic 
series $\sum 1/n$ for the left hand side for the reason that any integer n can be 
written in one and only one way as the product of a power of 2 and of an odd 
number, and so appears once and only once among the left hand side terms. 
By this ingenious procedure one finds the formula 

1 + 1/2 + 1/3 + . . . = 2(1 + 1/3 + 1/5 + . . .) 
whence “evidently” 

1/2 + 1/4 + 1/6 + . . . = 1 + 1/3 + 1/5 + . . . 

despite the fact that each term of the left hand side is strictly less than the 
corresponding term of the right hand side. One might push the paradox even 
further, substituting the right hand side into the left hand side; one would 
thus obtain the formula 

1/1.2 + 1/3.4 + 1/5.6 + ... = 0, 

particularly miraculous since the terms of the left hand side are all > 0. We 
will see in n° 18, Corollary of Theorem 13, how one can justify this type of 
operation, subject to hypotheses not satisfied in the preceding case. 
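
The one sound ingredient of Bernoulli's computation, the unique decomposition of an integer as a power of 2 times an odd number, can be illustrated as follows. This Python sketch is an addition to the text; the names split_power_of_two and denominators_up_to are hypothetical. It checks, up to a chosen bound, that the denominators m, 2m, 4m, ... taken over all odd m list every integer once and only once.

```python
def split_power_of_two(n):
    """Return (k, m) with n == 2**k * m and m odd, for an integer n >= 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k, n


def denominators_up_to(limit):
    """List the denominators m, 2m, 4m, ... (m odd) not exceeding `limit`,
    row by row, in the order of Bernoulli's relations."""
    found = []
    for m in range(1, limit + 1, 2):
        d = m
        while d <= limit:
            found.append(d)
            d *= 2
    return found


print(split_power_of_two(12))  # (2, 3), since 12 = 2**2 * 3
print(sorted(denominators_up_to(20)) == list(range(1, 21)))  # True
```

This confirms only the bookkeeping of the terms. It says nothing about the legitimacy of adding infinitely many series side by side, which is precisely where the argument breaks down for divergent series.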

One would be wrong to laugh at the Bernoullis. Even if, like the majority 
of their contemporaries, they evinced an excessive penchant for virtuosity, 
they did not have behind them three centuries of mathematicians who had 
totally eliminated the difficulties inherent in the conception and use of series; 
they were in the process of inventing the subject starting from nothing or nearly 
so. This Protestant family, which left Anvers for Frankfort and then Bale 
when the henchmen of the supercatholic Philippe II put down the revolt of 
the Low Countries at the end of the XVIth century, produced eight mathematicians 
- the two brothers Jakob (1654-1705) and Johann (1667-1748), 
the son Nikolaus (1687-1759) of a brother of these two, three sons of Johann, 
Nikolaus II (1695-1726), Daniel (1700-1782) and Johann II (1710-1790) and 
two sons of his, Johann III (1744-1807) and Jakob II (1759-1789) - well 
known or famous, without speaking of their activities as physicists, jurists, 
doctors, hellenists, etc.; see their notices in the DSB. The whole XVIIIth 
century calculated like them, notably Euler, the Bach of mathematics, student 
of Johann, and we shall yet see Fourier obtain prodigious results around 
1807, by using series much more outrageously divergent than those of the 
Bernoullis. 
After these examples, and those we shall present later in this chapter, one 
will understand perhaps a little better why the mathematicians of the XIXth 
century and yet more of the XXth, tired of false proofs, by great mathematicians, 
of generally correct theorems (there were surely also innumerable false 
theorems produced by lesser masters, but they have not passed to posterity), 
have finally propounded, at least implicitly, the following basic principles: 

(i) every assertion which is not fully proved is potentially false and is only, 
at best, an interesting conjecture, 

(ii) using an incompletely proved assertion to prove others increases the risk 
of error exponentially, 

(iii) the duty to prove an assertion falls on the author 

even if his colleagues do not refrain, on occasion, from doing so in his place, or 
from demolishing it. Naturally there are conjectures which people have tried 
to prove for decades, even centuries: Fermat’s Last Theorem, the Goldbach 
and Riemann Conjectures, etc. But to formulate or prove such hypotheses is 
not the lot of everyone ... 

The observation of these principles has led to the formidable intellectual 
discipline that mathematicians have progressively imposed on themselves for 
a century. One finds it nowhere else to the same degree. In physics theoreti- 
cians often take great liberties with the mathematics, their inspired intuitions 
being sufficient; the experimentalists insist on the reproducibility of their 
experiments, which, in Big Science, can lead far, even though one sometimes 
works on hypotheses which may be revealed to be totally false. True historians 
may attempt to observe the mathematicians’ rules, but the inevitable 
gaps in their information, the need to check sometimes falsified documents, 
and to interpret them objectively, makes it difficult. And imagine the career 
of a politician who applied these rules. 

An example of an assertion falling directly within the scope of prin- 
ciples (i), (ii) and (iii): it is thanks to nuclear arms that the Third 
World War has been avoided. Repeated ad nauseam for decades with- 
out the least proof being provided (one invokes Munich, or the Soviet 
regime, imperialism and arms, or even the precedent of Pearl Harbor; 
whereas the Foreign Policy of the Soviets has always been radically 
different from that of the Nazis or the Japanese before 1939, etc.), 
this assertion ignores all sorts of arguments which do not lead in the 
same direction. 

(1) Nuclear or not, the horrors of two World Wars and the almost to- 
tal unpredictability of this kind of enterprise, might have been enough 
to dissuade amateurs less mad than Adolf Hitler; look at the enthu- 
siasm of the French, British and Soviet leaders faced with Hitler 
already before September 1939; the USSR did not enter the war un- 
til attacked by the Nazis; the USA waited for Pearl Harbor and a 
declaration of war by Hitler. With the exception of the USA, which 
came out of the war much more powerful than it was at the beginning, 
and without comparatively great human loss (300 000 dead - 
the American Civil War caused twice as many - against the twenty million of 
the USSR, seven of Germany, etc.), all the belligerents were ruined 
or demolished to a point heretofore unimaginable; as the American 
diplomat George Kennan wrote from Moscow, “one has to see it to 
believe it” . 

(2) It has been recognised in the USA since 1947 that the East-West 
conflict takes place on an ideological much more than territorial level. 

The theoretical programme of the Soviets put much more stress on 
helping the internal revolutions in the Third World rather than on 
military conquests; their occupation of Central Europe, was, from the 
point of view of their strategy, in the first place justified by a possible 
resurrection of the German peril (also as much feared by the French) 
and, later, to provide them a defense or advance starting point in 
case of war against NATO. The first priority of the Soviets has al- 
ways been to preserve their regime: no adventurism, except under 
Khrushchev who lost his position on this account. That of the Americans 
since 1945 is to retain their “preponderance of power”, as Leffler 
says, their influence on the noncommunist world, and a technological 
superiority intended to maximise the losses of a possible adversary 
while minimising their own. The Pacific War cost one hundred and 
four thousand American dead but at least nine hundred thousand to 
the Japanese military (and nearly as many civilians). The wars of 
Korea and Vietnam each brought some tens of thousands of Ameri- 
can dead, but two or three million Koreans and Chinese and as many 
in the Indo-Chinese population; they were not all due to the Amer- 
icans - all the belligerents behaved like savages - but the enormous 
superiority of American arms explains a great deal. The first war in 
Iraq cost 147 or 148 American dead, and maybe 100 000 Iraqis. 

(3) It was the Hiroshima bomb which instantly convinced the Soviets 
that America constituted a “mortal menace” to them and engaged 
them in a frenetic nuclear arms race; one knows now that, from the 
end of August 1945, the Pentagon had a list of several dozens of Soviet 
cities and industrial zones to atomise in case of war 21 . Addressed in 
particular to the head of the Manhattan Project, the plan of 1945 
and those, much more apocalyptic, which have followed it, clearly 
indicated the desire of the military to be in a position to devastate 
the USSR in future if need be. 

21 Edward Zuckerman, The Day After World War III (Avon Books, 1987), pp. 181- 
183, who refers on p. 368 to a memorandum of General Norstad, and Richard 
Rhodes, Dark Sun. The Making of the Hydrogen Bomb (Simon & Schuster, 1995), 
pp. 23-24, who cites the archives of the Manhattan Project. 
From the political point of view, it was clearly not a question of executing 
such a plan in 1945. In fact, the production of bombs, which 
would have reached six per month by the end of the year if the war 
had continued, ended in practice thus: the majority of the partic- 
ipants in the project returned to civil life, the much too expensive 
installations were suppressed, and the others improved without great 
urgency, though the reduced scientific teams remaining in place pre- 
pared the future improvements (miniaturisation and thermonuclear) 
with the help of university consultants. The result was that America 
had only 13 bombs in 1947, to the great stupefaction of President 
Truman himself, who was relying on his winning weapon to base his 
politics; the Soviets, probably better informed, had not passed onto 
the offensive, if not on the diplomatic plane . . . 

In fact, the policy of “containment” of the USSR inaugurated in 1947 
on the inspiration of Kennan did not truly emphasise nuclear arms 
until after the first Soviet explosion of August 1949, the principal 
effort concerning, before, aeronautics and principally the production 
or the development of the strategic bombers B-36, B-47 and B-52. 
Foreseen since 1947 by Kennan, the final collapse or the “gradual 
mellowing” of the Soviet regime, as he called it, would have to be 
caused by internal problems. 

(4) The creation in January 1947 of a new administrative structure, 
the Atomic Energy Commission under civilian control, and the acceleration 
of the Cold War raised the stock to 298 in 1950, 2,280 in 1955, 
12,305 in 1959 (official figures) and 32,500 in 1967, after which it 
diminished progressively. The Soviet stock, estimated at 200 in 1955 and 
at 1,050 in 1959, widely surpassed the American maximum at the 
end of the 1970s (Bulletin of the Atomic Scientists, 12/1993). The 
study by Stephen I. Schwartz et al., Atomic Audit. The Cost and 
Consequences of U.S. Nuclear Weapons since 1940 (Brookings Inst. 
Press, 1998, 680 p.), where one finds not only the figures, estimates 
the minimum cost of nuclear weapons for the USA alone at 5 821 
billion dollars at 1996 values (French GNP for that year: 1280), of 
which 409 was for nuclear arms in the strict sense, the rest covering 
vectors, anti-aircraft and then anti-missile defence, satellites, etc. 
The multiplication of nuclear arms and of their vectors in the two 
camps could only powerfully contribute to accentuate their feelings 
of insecurity and their mutual hostility, as shown by the innumer- 
able allusions to the “Soviet menace” in the West, and notably in 
France (it threatened in the first place Soviet citizens, particularly 
under Stalin . . .), also the recent evidence of Russian atomic physi- 
cists seeking to protect their country from the “American peril” , not 
to be neglected at the beginning of the 1950s 22 and considered, right 
or wrong, as very serious during the Reagan period. 

(5) By far the most grave crisis of the Cold War was triggered by 
the clandestine installation in Cuba in 1962 of Soviet nuclear arms 
in response to the American arms in Europe, principally the mis- 
siles in Turkey, and the preparatory moves to invading Cuba. Some 
claim that the peaceful resolution of the crisis proved the efficacy 
of “deterrence” . Apart from the fact that there would not have been 
a “Cuban crisis” of such dimensions if not for nuclear arms, new 
information has been available for some years. At the height of the 
crisis, on the point of being arrested, the famous Colonel Penkowski, 
who had communicated a mass of information to the Americans concerning 
the arms of his country, is supposed to have sent a coded 
message to the CIA informing them of an imminent Soviet attack; in 
view of his personality the two employees of the CIA who received 
the message decided not to transmit it to their superiors (information 
impossible to confirm, but published by Raymond Garthoff, author 
of massive studies on American-Soviet relations). One knows that the 
Soviets left more than 40,000 men in Cuba, and not 10,000 as the 
CIA believed, with tactical nuclear arms which the local Soviet commander 
was authorised to use in case of an American invasion, a 
contingency which might very well have been realised if Khrushchev 
had waited a few more days before throwing in the towel: imagine 
the American reaction in such a case. One knows that, in the two 
camps, the military chiefs protested violently against the outcome of 
the crisis: they were for immediate invasion by the USA - everything 
was ready - and against “capitulation” in the USSR. At the height of 
the crisis, the chief of the Strategic Air Command took it on himself 
to send the order in clear to the B-52s loaded with bombs which flew 
constantly towards the USSR to go on a state of quasi-maximum 
alert, the American Navy harassing the Soviet submarines even in 
the Pacific. The then Secretary of Defense, Robert McNamara, later 
became the principal advocate of total abolition of nuclear arms. 

22 At this time, which follows the first Soviet explosion of August 1949, one sees a 
part of the press and numerous civilian or military personnel advocate a preven- 
tive attack while the Russians did not yet have a stock of bombs; for example, our 
colleague von Neumann (Logic and Set Theory, Hilbert Spaces, Game Theory, 
Implosion Theory for the Nagasaki Bomb, Programmable Computers, H Bomb, 
Missiles), then one of the most influential advisers of the Pentagon, is supposed 
to have said: If you say why not bomb them tomorrow, I say why not today. 
If you say today at five o’clock, I say why not one o’clock, if you believe Life, 
25/2/1957, on the occasion of his death. But neither Truman nor Eisenhower 
was disposed to take this risk, as, among other historians, Marc Trachtenberg 
explains lucidly, History and Strategy (Princeton UP, 1991), in a chapter entitled 
A “Wasting Asset”: American Strategy and the Shifting Nuclear Balance, 
1949-1954. See also Rhodes, Dark Sun, pp. 562-568. 
(6) The balance of power between the two camps was considered 
highly precarious for a long time. In clear logic, it would have been 
enough for both sides to have had about a hundred submarine-borne 
nuclear arms to be able to immobilise each other; the fact that this 
strategy was not adopted indicates that no one was confident in nu- 
clear arms as a guarantee of deterrence, despite the fact that nobody 
has yet found a way to protect oneself from them. To mention just 
one significant detail, it was to limit the damage from a possible 
Soviet attack that Paul Baran, at the Rand Corporation, invented 
packet switching in 1964-5, and out of this came the Arpanet and 
Internet. The Rand Corporation, the foremost “think tank” working 
for the Pentagon, was at the time developing strategies for nuclear 
war (Herman Kahn, Albert Wohlstetter, etc.). 

Maintaining the balance of power has, in fact, justified a race for 
technological innovations, almost all born in America and always 
replicated more or less faithfully in the USSR, a race destined to 
assure that neither of the two protagonists had a sufficient superi- 
ority to attempt to rub out the other without taking risks said to 
be “unacceptable” (at least 20% of the population and 60% of in- 
dustry . . . ) . This suggests that despite the nuclear arms intended 
to guarantee the peace, each of the two protagonists attributed to 
the other, right or wrong, the temptation to attack unannounced. 
Concerning possible ground-based operations in Europe, the USSR 
has consistently given itself the very expensive means to set off a 
massive offensive, while NATO has provided itself very early with 
the means, principally nuclear because more economical, to stop her 
(for the 1950s see the memoirs of General Gallois, who at the time 
was the French member of NATO’s Nuclear Planning Group), after 
which the Soviets in their turn adopted “tactical” nuclear arms. The 
invention by the Americans of missiles with multiple heads (MIRV) 
and the race to precision, officially justified by the need to destroy 
the enemy missiles before launch, would have, in case of an acute 
crisis, obliged each camp to shoot first in order not to destroy empty 
silos. This was pointed out at the time by American physicists very 
competent in the matter, and opposed to this strategy of which the 
existence of almost invulnerable submarines on both sides reinforced 
the absurdity. 

(7) An explanation which does not exclude the preceding one consists 
of thinking that in reality we are dealing with a race to bankruptcy 
in which the winner will be the loser. The Western economy, or even 
only the American, has always represented, at the very least, three 
times the Soviet (ten times in 1945 according to Kennan). In terms 
of GNP the arms race has necessarily weighed more heavily on the 
Soviet economy rather than on the American throughout this period 
(Soutou, p. 666, and Russian economists speak of 20-40% of the 
Soviet GNP during the 1980s, others of 15-20% during the whole 
period). The arms race thus contributed to prove Kennan right (he, 
opposed to nuclear madness, retreated rapidly and for life to the 
Institute for Advanced Study at Princeton . . . ). It also confirmed the 
true or assumed ineffectiveness of the socialist system, a besieged 
besieger since its birth. Further, it dispensed the United States 
from maintaining a vast ground army which would have been much 
more expensive. 

While raising, thanks to the Korean War, the military budget to over 
10% of the American GNP - an evolution advocated, three months 
before the outbreak of the war, by a National Security Council well 
aware of the immense superiority of American industry -, President 
Eisenhower had, to protect the American economy, always refused to 
go further because, as he said at the time, the confrontation risked 
lasting until the end of the century ; America never spent more than 
7% after 1970. About fifteen years ago it was again claimed seriously 
that it was economically impossible, for the West, to produce 3,000 
heavy tanks per year as, we have been told, the Soviets were doing. 
It is true that at this period the NATO countries were only just 
capable of producing more than twenty million cars each year and 
some hundreds of thousands of heavy vehicles, agricultural machines 
and machinery of all sorts . . . 

Let us add that the documents, mainly Soviet and French, which 
would be indispensable to a true comprehension of the situation, re- 
main largely inaccessible. The only certitudes are that the ideological 
hostility between the USA and the USSR dates from 1917 (no diplo- 
matic relations before 1933) and not from 1945; that after 1945 the 
two camps, like the French and the Germans before 1914, steadily 
perfected their arms and their war plans; that nuclear arms have, in 
the two camps, contributed to reinforce the instinctive recoil from 
the catastrophe; but that their existence would have, in the case of 
an acute crisis, precipitated it because of the drastic reduction in the 
delay of possible reaction and of the fact that, compared to these, 
“the concentration camps and the gas chambers are just the work of 
artisans” (Pierre Sudreau, L’enchaînement, Plon, 1967, p. 209, by a 
non-orthodox Gaullist). As for imagining a realistic version of what 
history would have been without nuclear arms, the exercise is mean- 
ingless: history is not an experimental science. 

In a century, the dominant theory will perhaps be that it was despite 
nuclear arms that the Third World War was avoided before 1997 
(wait before pronouncing on the future). This is what some political 
scientists or historians have begun to suggest, for example, Michael 
MccGwire, Deterrence: the problem - not the solution (International 
Affairs, 1986, pp. 55-70), Soutou, too, being rather sceptical. John 
Mueller, Retreat from Doomsday. The Obsolescence of Major War 
(Basic Books, 1989) and John Keegan, The Second World War (Pen- 
guin Books, 1990, pp. 594-595) invoke the experience of two world 
wars. Some others think that nuclear arms have not served for anything 
from this point of view, which is surely the case of the French arms 
under de Gaulle, whose principal military mission was to 
transform a possible conventional conflict into a nuclear war. If they are 
right, the success of “nuclear deterrence” would be due to the fact 
that there was no one to deter, or, as Mueller says, because the true 
deterrent was Detroit : the enormous American industrial superiority, 
nuclear or not. 

One can, in French, assess the extent of the subject in Pierre Grosser, 
Les temps de la guerre froide (Bruxelles, Ed. Complexe, 1996) who 
cites almost all that is in the public domain, but confines himself 
to very brief generalities on what concerns the arms race and its 
connections with scientific and technological progress. The same is 
true of George-Henri Soutou, La Guerre de Cinquante Ans (Paris, 
Fayard, 2001), an otherwise superb book by a top historian. The 
majority of other French authors, particularly the experts in strategy, 
have confined themselves for decades to sounding the tocsin for a 
conflagration [the invasion of Europe] which never happened (Samuel 
Huntington) while frequently pouring oil on the flames. We can cite, 
from the immense American literature, several serious books: Daniel 
Yergin, Shattered Peace. The Origins of the Cold War and the Na- 
tional Security State (Houghton Mifflin, 1977), George F. Kennan, 
The Nuclear Delusion (Pantheon Books, 1982), Fred Kaplan, The 
Wizards of Armageddon (Simon & Schuster, 1983) on the history 
of American nuclear strategy, McGeorge Bundy, Danger and Sur- 
vival. Choices About the Bomb in the First Fifty Years (Vintage, 
1990), Samuel R. Williamson, Jr and Steven L. Rearden, The Origins 
of U.S. Nuclear Strategy, 1945-1953 (St. Martin’s Press), Melvin 
Leffler, A Preponderance of Power. National Security, the Truman 
Administration, and the Cold War (Stanford UP, 1992), and The 
Specter of Communism. The United States and the Origins of the 
Cold War, 1917-1953 (Hill and Wang, 1994), John Lewis Gaddis, 
We Now Know. Rethinking Cold War History (OUP, 1997). For a 
balanced and easy to read exposition see Martin Walker, The Cold 
War and the Making of the Modern World (Vintage, 1994), by a 
British journalist who has covered the subject but, like many other 
authors, does not seem to have observed that the marvels of high 
tech are one of the principal contributions of the Cold War to the 
construction of the “Modern World”, and that, in this domain too, 
the children inherit their parents’ genes. 
Let us return to the mathematicians. As in every very hierarchical community 
obeying strict rules, for about fifty years discipline has been imposed 
by a sort of policing. Before publishing an article, and of course not only in 
mathematics, every serious journal now submits it to specialists, the “refer- 
ees” or “gate keepers of Science” as a sociologist has called them, or, if one 
prefers a French term, the sentinels of Science. They do not fail to detect the 
weaknesses, particularly when the results announced are so very important 
that some of the referees are angry at not having found them themselves 
before the author; this system has contributed to improving the level of pub- 
lications considerably. Moreover, all the authors of scientific articles now send 
“preprints” (“hard” or “virtual”) to their colleagues, to arrive six or twelve 
months before the printed article. This is useful to one’s colleagues and frees 
them from following the same path if they were on it - thus doubly assuring 
one’s priority -, but can also lead these to criticise more or less severely the 
articles that they receive. Not everyone reacts like Legendre, the great expert 
on the theory of elliptic functions under the Revolution and the Empire, who, 
in his late days, about 1825, on learning the results of Abel and of Jacobi 
which totally revolutionised the subject, was extremely delighted at such a 
“great step forward” and only complained that these young people born with 
the century advanced too fast for him at seventy still to be able to hope to 
follow them . . . 

This is what happened a few years ago to Andrew Wiles with his proof, 
seven years of work, of “Fermat’s Last Theorem” which generations of math- 
ematicians have ogled for three centuries, as generations of alpinists ogled 
Everest before Edmund Hillary. One might well think that the referees set to 
on Wiles’ preprints, breaking off all other activities, fell onto the object in 
the hope or fear, according to the degree of generosity of the person consid- 
ered, of finding gaps or errors as had always been the case in the past; there 
was indeed an error in the calculation which, apparently, vitiated it all. But 
Wiles corrected it a year later with the help of one of the referees; those of 
the gate keepers who were angry at having lost their “first” when they were 
so close had their hopes dashed. It is better not to think of what would have 
happened to Wiles if his proof had been irremediable, as happened recently 
a propos another Everest. Some of the more prestigious authorities decided 
that the author - although brilliant and, it is the least one can say, not lack- 
ing in courage - of this false proof that used mathematics other than their 
own “was not of the required level” to succeed. Neither were they, until proof 
to the contrary . . . 

One can also meditate on the recent affair of the “memory of water” when 
a French biologist in other respects well-reputed, having published a revolu- 
tionary theory depending on difficult or impossible to reproduce experiments, 
was expelled with the ultimate brutality from the “community” by dozens 
of pontiffs of Physics and of Biology, agitating against the spectres of irra- 
tionality, of homeopathy and fraud, and claiming that his publications were 
capable of “harming the image of French science”. As if that depended on 
the work of only one man, and as if “American science” was discredited when 
a biologist made up his mice in black and white to prove his theory correct! 
On this topic let us cite the opinion of Robert Hutchins 23 in 1963: 

There have been very few scientific frauds. This is because a scientist would 
be a fool to commit a scientific fraud when he can commit frauds every 
day on his wife, his associates, the president of the university, and the 
grocer ... A scientist has a limited education. He labours on the topic of 
his dissertation, wins the Nobel Prize by the time he is 35, and suddenly 
has nothing to do ... He has no alternative but to spend the rest of his 
life making a nuisance of himself. 

The author of this declaration has clearly let himself be led by his taste 
for paradox; one should however note that he presided over the University of 
Chicago before, during and after the war and so has had some observations 
to deliver . . . 

And Euler? His Complete Works - more precisely, those that he prepared 
himself with a view to possible publication -, in course of being edited in 
Germany, and then in Switzerland since 1911, must comprise about 80 vol- 
umes in quarto, of which 29 are on mathematics not to speak of mechanics, 
of hydraulics, of astronomy, of mathematical physics, of optics, of shipbuild- 
ing theory and navigation, of geodesy, of artillery, of financial mathematics 
and of the widely circulated Lettres à une princesse d’Allemagne sur divers
sujets de physique et de philosophie, intended for the education of a niece
of Frederick II 24 . He would have done better to confide this heavy task to
people such as Diderot, d’Alembert or Voltaire who inspired him with
considerably more sympathy than Euler and his Calvinism (his father was a pastor
in Bâle 25 , and it was mathematics which, without changing his ideas,
eventually dissuaded the young Euler from following him). There is also, as well
as these 80 volumes, an enormous correspondence of which the essential part 
remains to be published. He lost an eye in 1738 and was blind from 1771, but 
continued to work to the same rhythm. 

Euler did not publish all that he wrote, far from it (560 books and articles
during his life, but there are about 300 others), principally because he wrote

23 Cited in Daniel S. Greenberg, The Politics of American Science (Penguin Books,
1969; original American title: The Politics of Pure Science, New American Library,
1967). Greenberg collaborated in the great journal Science, which he had to leave
because of his disrespectful points of view. His book is still worth reading.

24 “The King calls me ‘my professor’, and I am the happiest of men” (Hairer
and Wanner, p. 159). However this did not stay so for very long, Frederick II 
appreciating much more his capabilities as administrator of the Berlin Academy 
than his “useless” mathematics. 

25 In the XVII th and XVIII th centuries, and even more in the XIX th , many scientists 
and more generally intellectuals were, in the protestant countries, sons of pastors, 
no doubt because these were educated people. The corresponding phenomenon is 
found more rarely in the catholic countries, but since the instruction there was 
almost totally monopolised by the religious schools, the results were not very 
different until the Revolution. 







too much for the capacities of the time; a legend, maybe apocryphal, but 
significant, tells that, when an editor came to ask him for a paper, he just 
handed him the topmost of the pile of his latest productions. With such habits 
and inspired proofs, though a little or very false as we shall see on various 
occasions, he would have had a lot of trouble nowadays. Of course, in our 
time, he would have been educated by mathematicians more “serious” than 
Johann Bernoulli and would have conformed to the rules of the corporation 
as, in his time, he conformed in his private life, to those he had absorbed in 
the puritan society of Bâle. Happily there were, mainly in France, people to
advance other things than mathematics, hydraulics and artillery. 

8 — Algebraic operations on limits 

It is indispensable to know what happens when one performs simple algebraic 
operations on sequences which tend to limits. The theory rests on a very few 
results. 

Theorem 1. Let $(u_n)$ and $(v_n)$ be two convergent sequences, with limits $u$
and $v$. Then the sequences $(u_n + v_n)$ and $(u_n v_n)$ converge to $u + v$
and $uv$. If $v \neq 0$, then $v_n \neq 0$ for large $n$, and the sequence
$(u_n/v_n)$, defined for large $n$, converges to $u/v$.

In other words, 

$$\lim (u_n + v_n) = \lim u_n + \lim v_n,$$
$$\lim (u_n v_n) = (\lim u_n)(\lim v_n),$$
$$\lim (u_n / v_n) = (\lim u_n)/(\lim v_n) \quad \text{if } \lim v_n \neq 0.$$

One can add the sums of series too - consider the partial sums: 

$$\sum (u_n + v_n) = \sum u_n + \sum v_n.$$

Let us move on to the proof of Theorem 1.

Case of a sum. We need only write that 

$$|(u_n + v_n) - (u + v)| \le |u_n - u| + |v_n - v|;$$

for $n$ sufficiently large, each of the two last differences is $< r/2$; the left
hand side is thus $< r$, qed.

Case of a product. This relies on the following lemma (the continuity of
the map $(x, y) \mapsto xy$ of $\mathbb{C}^2$ into $\mathbb{C}$):

Lemma. Let $u$ and $v$ be complex numbers. For every $r > 0$ there exists a
number $r' > 0$ such that the relations

(8.1) $|u' - u| < r'$ & $|v' - v| < r'$ imply $|u'v' - uv| < r$.

Note that

$$u'v' - uv = (u' - u)(v' - v) + v(u' - u) + u(v' - v).$$

The two first inequalities (1) thus imply

$$|u'v' - uv| < r'^2 + ar' \quad \text{where } a = |u| + |v|.$$

For $r' < 1$, the right hand side is $< (1 + a)r'$. The left hand side will thus
be $< r$ so long as one chooses $r' < \min(1, r/(1 + a))$, qed.
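
The choice of r' made in the lemma can be tested numerically. The Python sketch
below is only an illustration and proves nothing: the names `product_tolerance`
and `check_lemma`, and the random sampling, are assumptions introduced here. It
takes half of the bound min(1, r/(1 + a)), one admissible value among many, and
tests the implication (8.1) at random complex points.

```python
import cmath
import random


def product_tolerance(u, v, r):
    """Return an r' > 0 admissible in the lemma (hypothetical helper).

    The lemma asks for r' < min(1, r/(1 + a)) with a = |u| + |v|;
    half of that bound is one valid choice.
    """
    a = abs(u) + abs(v)
    return 0.5 * min(1.0, r / (1.0 + a))


def check_lemma(u, v, r, trials=1000, seed=0):
    """Sample u', v' with |u' - u| < r' and |v' - v| < r'; test |u'v' - uv| < r."""
    rng = random.Random(seed)
    r_prime = product_tolerance(u, v, r)
    for _ in range(trials):
        # random increments strictly inside the disc of radius r'
        du = cmath.rect(rng.uniform(0, 0.999) * r_prime, rng.uniform(0, 2 * cmath.pi))
        dv = cmath.rect(rng.uniform(0, 0.999) * r_prime, rng.uniform(0, 2 * cmath.pi))
        if abs((u + du) * (v + dv) - u * v) >= r:
            return False
    return True
```

A call such as `check_lemma(2 + 1j, -3j, 0.01)` samples a thousand pairs (u', v')
and reports whether the third relation of (8.1) held for all of them.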

To deduce from this the case of Theorem 1 which interests us here, it is
enough to replace $u'$ and $v'$ by $u_n$ and $v_n$; the two conditions of (1) are
then satisfied for all sufficiently large $n$, and the third relation provides the
result.
(Here as always, one exploits the fact that if two relations are separately valid 
for n large, then they are also valid simultaneously.) 

Case of a quotient. Given that $u_n/v_n = u_n \times 1/v_n$, it is enough to
examine $1/v_n$ and then to apply the result for a product.

The fact that $v_n \neq 0$ for large $n$ is clear: for $n$ large, one has for
example $|v_n - v| < |v|/2$ since the right hand side is $> 0$; it follows that
$|v_n| > |v|/2 > 0$. To imitate the preceding argument let us put $v_n = v'$. We
need to evaluate

$$|1/v' - 1/v| = |v' - v|/|vv'|.$$

For $n$ large, the numerator of the right hand side is $< r'$. Now we have just
seen that the denominator is greater than $|v|^2/2$. The right hand side is thus
$< 2r'/|v|^2$, so $< r$ provided that $r' < r|v|^2/2$, qed.
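
The two requirements used in this argument, |v' - v| < |v|/2 and
|v' - v| < r|v|²/2, can be merged into a single tolerance. The following Python
sketch does so and tests the resulting bound at random points; the helper names
and the sampling are assumptions made for illustration only.

```python
import cmath
import random


def reciprocal_tolerance(v, r):
    """Return an r' > 0 such that |v' - v| < r' forces |1/v' - 1/v| < r.

    Hypothetical helper: r' <= |v|/2 keeps |v'| > |v|/2, and
    r' <= r|v|^2/2 then bounds the difference of the inverses by r.
    """
    if v == 0:
        raise ValueError("v must be nonzero")
    return min(abs(v) / 2, r * abs(v) ** 2 / 2)


def check_reciprocal(v, r, trials=1000, seed=0):
    """Sample v' with |v' - v| < r' and test |1/v' - 1/v| < r each time."""
    rng = random.Random(seed)
    r_prime = reciprocal_tolerance(v, r)
    for _ in range(trials):
        # random increment strictly inside the disc of radius r'
        dv = cmath.rect(rng.uniform(0, 0.999) * r_prime, rng.uniform(0, 2 * cmath.pi))
        if abs(1 / (v + dv) - 1 / v) >= r:
            return False
    return True
```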

Theorem 1 allows one to calculate a large number of limits easily, if only 
very simple ones. For example, the sequence with general term 

$$u_n = \frac{n^2 - 1}{3n^2 + n + 1} = \frac{1 - 1/n^2}{3 + 1/n + 1/n^2}$$

tends to 1/3 since $1/n$ and $1/n^2$ tend to 0, so that the two parts of the
fraction tend to 1 and 3 respectively.

More generally, let 

$$f(x) = a_p x^p + a_{p-1} x^{p-1} + \dots + a_0,$$
$$g(x) = b_p x^p + b_{p-1} x^{p-1} + \dots + b_0$$

be two polynomials of the same degree $p$ [when one says that $f$ is of degree
$p$, this means that $a_p \neq 0$]. Then 26

26 The relation below persists, with the same proof, if, in f(x)/g(x), one lets x tend 
to infinity through not necessarily integer values. We mainly confine ourselves in 
this chapter to limits in which the independent variable takes “discrete” values. 
The next chapter will show that many of these results extend to the case of 
“continuous” variables (in the sense where, in physics, one speaks of the “discrete 
spectrum” and of the “continuous spectrum” of a luminous source: the first is 
composed of isolated “rays” of zero width, the second of luminous “bands” of 
nonzero width). 
$$\lim_{n \to +\infty} f(n)/g(n) = a_p/b_p.$$

The proof is the same as above: one divides the two members of the fraction
by $n^p$ and remarks that $1/n$ and all its powers tend to 0 as $n$ increases
indefinitely.
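
The behaviour of such quotients is easy to observe in exact rational arithmetic.
The Python sketch below (the function names are assumptions chosen for this
illustration) evaluates f(n)/g(n) for the earlier example
(n² - 1)/(3n² + n + 1) and records its distance from 1/3.

```python
from fractions import Fraction


def poly(coeffs, x):
    """Evaluate a_p x^p + ... + a_0 by Horner's rule; coeffs = [a_p, ..., a_0]."""
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value


def ratio(f_coeffs, g_coeffs, n):
    """Exact value of f(n)/g(n) as a Fraction (g(n) must be nonzero)."""
    return poly(f_coeffs, n) / poly(g_coeffs, n)


# f(x) = x^2 - 1 and g(x) = 3x^2 + x + 1; the leading coefficients give 1/3.
f = [1, 0, -1]
g = [3, 1, 1]
gaps = [abs(ratio(f, g, n) - Fraction(1, 3)) for n in (10, 100, 1000, 10000)]
```

The list `gaps` holds the exact distances to 1/3 at n = 10, 100, 1000, 10000;
they shrink roughly like 1/(9n), in agreement with the proof.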

Beyond these rules of algebraic calculation, there is another simple
operation which transforms one convergent sequence into another. Let $(u_n)$ be
a sequence which converges to $u$ and let $f$ be a scalar function defined on a
neighbourhood of $u$, except perhaps at $u$, and such that $f(x)$ tends to a limit
$v$ when $x$ tends to $u$. Then $f(u_n)$ tends to $v$. For all $r > 0$, there is
indeed an $r' > 0$ such that $|x - u| < r'$ implies $|f(x) - v| < r$; then one has
$|u_n - u| < r'$ for $n$ large, whence $|f(u_n) - v| < r$ for $n$ large.

If in particular a function f is continuous at a point a, then 

(8.2) $\lim u_n = a \implies \lim f(u_n) = f(a)$,

a fundamental result even though almost trivial (i.e. following directly from 
the definitions). 
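
A small numerical illustration of (8.2), offered only as a sketch: the sequence
u_n = 2 + 1/n and the square root function are examples assumed here, not taken
from the text, and `image_gaps` is a hypothetical helper.

```python
import math


def image_gaps(f, u, a, indices):
    """Return |f(u(n)) - f(a)| for each n in indices (illustrative helper)."""
    return [abs(f(u(n)) - f(a)) for n in indices]


# u_n = 2 + 1/n tends to a = 2, and the square root is continuous at 2.
gaps = image_gaps(math.sqrt, lambda n: 2 + 1 / n, 2, [10, 100, 1000, 10000])
```

The computed gaps decrease with n, as (8.2) predicts; floating-point values can
of course only suggest the limit, not establish it.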
§2. Absolutely convergent series

9 — Increasing sequences. Upper bound of a set of real numbers 

For a start let us make some remarks on passing to the limit in inequalities, 
essential for understanding axiom (IV) of n° 1. 

First, it is obvious that if a sequence of numbers, positive (i.e. $\ge 0$) for
$n$ large, converges, then its limit is again positive; the reader may provide the
$\varepsilon$ and the $N$ necessary for a textbook proof ... It follows from this
that

(9.1) $a \le u_n \le b$ for all $n$ large $\implies a \le \lim u_n \le b$,

as one sees on considering the sequences $u_n - a$ and $b - u_n$. Similarly

(9.2) $u_n \le v_n$ for all $n$ large $\implies \lim u_n \le \lim v_n$

since the terms of the sequence $v_n - u_n$ are positive for $n$ large.

In other words, weak inequalities are preserved under passage to the limit.
Not so for strict inequalities: one has $1/n > 0$ for all $n$, but $\lim 1/n = 0$.
Without explicit evidence to the contrary, passing to the limit transforms
strict inequalities into weak inequalities: the inequality $1/n > -2$ is preserved
in the limit since there exists a number $r > 0$ such that $1/n > -2 + r$ for all
$n$, so that the limit is $\ge -2 + r > -2$.

These results, though trivial, bring us back to axiom (IV) for $\mathbb{R}$
mentioned in n° 1 of this chapter. Consider an increasing sequence

(9.3) $u_1 \le u_2 \le \dots \le u_n \le u_{n+1} \le \dots$

of real numbers. For a sequence to converge, it is clearly necessary that,
increasing or not, there exists a positive number $M$ such that $|u_n| \le M$ for
all $n$, i.e. that the sequence should be bounded. In the case of interest this
means the existence of numbers $M$ which majorise the sequence, i.e. satisfy
$M \ge u_n$ for all $n$, the weak inequality being essential in what follows. One
also says that $M$ is a majorant of the given sequence and that the latter is
majorised by $M$, or majorised for short if one does not want to specify $M$
exactly, or also bounded above.

Suppose now that the sequence (3) converges to a limit $u$. By property
(1) above, the relation

(9.4) $u_p \le M$ (for all $p$) implies $u \le M$.

Since on the other hand $u_p \le u_n$ for all $n \ge p$, one sees similarly, on
passing to the limit over $n$, that also

(9.5) $u_p \le u$ for all $p$.

The relation (4) shows that every majorant is $\ge u$, and (5) that $u$ is one of
these majorants; conclusion:
(9.6) If an increasing sequence $(u_n)$ converges, then the set of its
majorants possesses a least element, namely the limit of the given
sequence.

Conversely: 

(9.7) Let $(u_n)$ be an increasing sequence, and bounded. Suppose that
the set of its majorants possesses a least element $u$. Then $u_n$
converges to $u$.

Take a number $r > 0$. Since $u$ is the least number which majorises the
sequence, the number $u - r$ does not majorise it. There is therefore an index $p$
such that $u - r < u_p$. Since the sequence is increasing, one again has
$u - r < u_n$ for all $n \ge p$. But since $u$ majorises the sequence, one has
finally

(9.8) $u - r < u_n \le u$ for all $n \ge p$,

which establishes (7). 
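
The index p of this argument can be exhibited explicitly for a concrete sequence.
The Python sketch below is an illustration under assumptions made here: the
sequence u_n = 1 - 1/2^n, whose least majorant is 1, and the helper name
`first_index_within` do not come from the text.

```python
from fractions import Fraction


def first_index_within(u_term, least_majorant, r, max_index=10_000):
    """Smallest p >= 1 with least_majorant - r < u_term(p), or None if not found.

    For an increasing sequence, every n >= p then satisfies
    least_majorant - r < u_n <= least_majorant, which is (9.8).
    """
    for p in range(1, max_index + 1):
        if least_majorant - r < u_term(p):
            return p
    return None


# Assumed example: u_n = 1 - 1/2^n, increasing, with least majorant 1.
u = lambda n: 1 - Fraction(1, 2**n)
p = first_index_within(u, 1, Fraction(1, 1000))
```

With r = 1/1000 the search stops at the first p for which 2^p exceeds 1000.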

We have no alternative. When a “magnitude”, as one used to call it,
increases constantly but not indefinitely, i.e. remains below a certain finite
value, then common sense - the most widely shared thing in the world
according to a philosopher and mathematician who believed only what he
could prove or verify himself -, common sense, then, indicates that this
magnitude must necessarily accumulate towards a limit. In mathematical terms:
every increasing bounded-above sequence converges.

For example, common sense indicates that the sequence 

3 3.1 3.14 3.141 3.1415 3.14159 ... 

must converge to something. Alas, this something is not rational, so common 
sense will not help us at all if we know only about Q. This will not wipe out 
any of the banalities which we have already established in this chapter, since 
they rely only on axioms (I), (II) and (III) common to Q and to R, including 
(6) and (7); but to go further one clearly needs an axiom specific to R so 
as not to have false theorems such as: “every bounded increasing sequence 
of rational numbers converges to a rational limit” or, what would hardly be 
better, under penalty of being unable to attribute a limit to almost all the 
convergent sequences that one meets in analysis. 
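
The phenomenon can be observed in exact arithmetic. The text's example uses the
decimals of π; the Python sketch below substitutes √2, an assumption made for
convenience, because the condition t² < 2 can be tested exactly on rational
numbers. The function name `sqrt2_truncation` is hypothetical.

```python
from fractions import Fraction
from math import isqrt


def sqrt2_truncation(n):
    """Largest decimal with n digits after the point whose square is <= 2."""
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)


# 1, 1.4, 1.41, 1.414, ... : increasing, rational, bounded above by 2.
terms = [sqrt2_truncation(n) for n in range(8)]
```

Each term is rational, the terms increase, and their squares approach 2 without
ever reaching it, since no rational number has square 2; whatever limit the
sequence has therefore cannot lie in Q.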

It is clearly the axiom (IV) of n° 1 that we lack. This affirms that if a
nonempty set $E \subset \mathbb{R}$, for example the set of numbers of the form
$u_n$ in what precedes, is bounded above, then the set of numbers which majorise
it, its majorants, possesses a least element, its least upper bound. This makes
the following theorem obvious, by (7):

Theorem 2. For an increasing sequence of real numbers to converge it is
necessary and sufficient that it be bounded above. Its limit is then the least
number which majorises it, i.e. the least upper bound of the set of its terms.
The crucial point masked by this statement is the postulate that, among all
the majorants of the given sequence, there exists a number smaller than all
the others.

There is a similar statement for decreasing sequences: such a sequence
converges if and only if it is minorised, i.e. if there exist numbers $m$ less
than all its terms. Its limit is then the largest of these minorants, i.e. the
greatest lower bound of the set of its terms.

To clarify the reader’s thoughts, it is indispensable to introduce, or to re- 
vise more systematically than we have done so far, some very easy definitions 
in constant use. 

One says that a set $E \subset \mathbb{R}$ is bounded above, or majorised, if
there exists a number $M$ such that $x \le M$ for all $x \in E$; one then says
that $M$ majorises $E$, or is a majorant of $E$, or that $E$ is majorised by $M$.
There are analogous definitions for sets bounded below, or minorised, and numbers
which minorise the set, etc. Finally, one says that a set $E \subset \mathbb{C}$
is bounded if there exists a number $M > 0$ such that $|x| \le M$ for all
$x \in E$.

Let $E \subset \mathbb{R}$ be a set bounded above and let $M$ and $M'$ be two
majorants of $E$. If $M < M'$, the relation $\{x \in E \implies x \le M\}$ is
clearly stronger than the similar relation with $M'$. For example, it is not
without interest to know that, of the human species, everyone dies before
attaining the age of 500 years, but to avoid surprises it is better to know that
everyone dies before 250 years. This information again not being, it would seem,
the best possible, one might try to determine as small an age as possible before
which everyone dies. This would be the precise least upper bound of human life.
Whence the concept of the least upper bound of a set $E \subset \mathbb{R}$
bounded above: it is a number $u = \sup(E)$ satisfying the two following
conditions:

(SUP 1) $x \le u$ for all $x \in E$, i.e. $u$ majorises $E$;

(SUP 2) $u \le M$ for any other majorant $M$ of $E$.

In other words, sup(E) is the least majorant of E. 

One could replace (SUP 2) by 

(SUP 2’) for all $r > 0$ there exists an $x \in E$ such that $u - r < x$.

If indeed $u$ is the least possible majorant, then the number $u - r$ does not
majorise $E$, whence the existence of $x$. If conversely (SUP 2’) holds, then
every $M$ majorising $E$ majorises, for any $r > 0$, an $x > u - r$, so majorises
$u - r$ for any $r > 0$, so majorises $u$ (modified Archimedes’ axiom), whence
(SUP 2).

(SUP 2’) also implies that $u$ is the limit of a sequence of elements of $E$:
choose $x_n \in E$ such that $u - 1/n < x_n$; since $x_n \le u$ by (SUP 1), the
sequence $(x_n)$ tends to $u$.

As we have seen above, the difficulty in proving that an increasing se- 
quence tends to a limit disappears the moment the existence of the least 
upper bound is established. The problem is solved by axiom (IV) of n° 1
which the majority of authors call Bolzano’s “Theorem” because he was the 
first, it seems, to formulate it more or less clearly, in 1817. He did not prove 
it, no doubt because it seemed too obvious to him, and, in reality, because no 
one of his time yet had a sufficiently clear idea of the concept of real number 
to be able to provide a correct proof. Hardly a surprising situation: if you 
want to prove such a “theorem” you have to rely on other previous results; 
now the axioms (I), (II) and (III) of n° 1 clearly have not the least chance 
of sufficing, since, if such were the case, they would prove that Bolzano’s 
Theorem is valid in Q; the invention of real numbers would then be totally 
superfluous. If one really wants to make the existence of least upper bounds a
theorem, one needs to have either an axiom valid in R but not in Q, as do those
who prefer the “nested intervals” axiom (for us, a theorem of Chap. III), or a
rigorous construction of R, for example that provided by Dedekind sections 
of which we have spoken in the introduction and at the end of n° 1. 

It may all the same be interesting to show that Bolzano’s Theorem, as it 
is called, which trivially implies Theorem 2, is, conversely, a consequence of 
it, so that Theorem 2 might also have been taken as axiom (IV).

Bolzano’s “Theorem” (1817). Every nonempty bounded-above subset E 
of R possesses one and only one least upper bound. 

Uniqueness is clear: one cannot see how a set of numbers, the majorants
of $E$, which possesses a least element could possess two different ones, for
each has to be smaller than the other. Paul Klee once depicted two naked
bureaucrats bowing to each other with their backs at the horizontal, each
thinking the other of a higher rank than his own. This has not prevented some 
authors, including the present one formerly, from providing a textbook proof 
of the uniqueness of the least upper bound. Now let us prove its existence. 

Let $M$ be a majorant of $E$. There are integers $n \in \mathbb{Z}$ which
majorise $E$, for example those larger than $M$. There are also integers which do
not majorise $E$ since $E$ is nonempty; they are all $< M$, so one can consider
the largest of them, say $u_0$. It does not majorise $E$, but $u_0 + 1$ majorises
$E$, for otherwise $u_0$ would not be the largest possible.

Among the numbers $u_0 + n/10$, where $n$ is an integer $\ge 0$, let $u_1$ be the
largest of those which do not majorise $E$; one has $n \le 9$ for this number
since $u_0 + 10/10$ majorises $E$. So $u_0 \le u_1$, $u_1$ does not majorise $E$,
but $u_1 + 1/10$ does.

Similarly, let $u_2$ be the largest of the numbers of the form $u_1 + n/100$
which do not majorise $E$. One has $n \le 9$ since $u_1 + 10/100$ majorises $E$,
$u_1 \le u_2$, $u_2$ does not majorise $E$, but $u_2 + 1/100$ does.

On repeating this construction indefinitely, one obtains an increasing
sequence of numbers $u_n$ possessing the following properties:

(i) $u_n$ does not majorise $E$, (ii) $u_n + 1/10^n$ majorises $E$.
Since $u_n < M$ for all $n$, the sequence $u_n$ tends to a limit $u$, by virtue of
Theorem 2. Let us show that $u$ satisfies the conditions (SUP 1) and (SUP 2’)
which characterise the least upper bound of $E$.

For all $x \in E$ we have $x \le u_n + 1/10^n$ for any $n$, since the right hand
side majorises $E$. On the other hand, it converges to $u$. The remarks at the
beginning of this n° then show that $x \le u$, whence (SUP 1).

Since $u_n$ does not majorise $E$, for each $n$ there is an $x_n \in E$ such that
$u_n < x_n$. One has also $x_n \le u_n + 1/10^n$ since the right hand side
majorises $E$. In consequence, $\lim x_n = \lim u_n = u$ whence (SUP 2’), and this
completes the proof.
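
The construction of u_0, u_1, u_2, ... in this proof is effectively an algorithm,
provided one can decide whether a given rational number majorises E. The Python
sketch below follows the steps of the proof under that assumption; the function
`sup_approximations`, its parameters, and the sample set E of rationals x with
x² < 2 are hypothetical choices made for illustration.

```python
from fractions import Fraction


def sup_approximations(majorises, lower_int, steps):
    """Return [u_0, u_1, ..., u_steps] as in the proof.

    majorises(q) must tell whether the rational q majorises E;
    lower_int is an integer assumed NOT to majorise E (E is nonempty).
    """
    u = Fraction(lower_int)
    if majorises(u):
        raise ValueError("lower_int must not majorise E")
    # u_0: the largest integer which does not majorise E
    while not majorises(u + 1):
        u += 1
    approximations = [u]
    for k in range(1, steps + 1):
        step = Fraction(1, 10**k)
        # largest n in 0..9 such that u + n/10^k does not majorise E
        n = 0
        while n < 9 and not majorises(u + (n + 1) * step):
            n += 1
        u += n * step
        approximations.append(u)
    return approximations


# Assumed example: E = set of rationals x with x^2 < 2.
# A rational q majorises E exactly when q > 0 and q^2 >= 2.
majorises_E = lambda q: q > 0 and q * q >= 2
approx = sup_approximations(majorises_E, 0, 4)
```

Here `approx` holds the terminating decimals 1, 1.4, 1.41, 1.414, 1.4142, each of
which fails to majorise E while u_n + 1/10^n does, which are properties (i) and
(ii) of the proof.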

We have quite deliberately used decimal numbers in the preceding proof,
to construct successive terminating decimal expansions $u_0, u_1, u_2, \dots$ for
$u$. To be precise, if one writes the default decimal expansion of each $x \in E$
in the form

$$x = x_0.x_1x_2\dots$$

with an integer part $x_0$ and decimals $x_1, x_2, \dots$ between 0 and 9, then
one obtains $u_0$, $u_1$, etc. by the following procedure: $u_0$ is the maximum
value taken by $x_0$ when $x$ varies in $E$; $u_1$ has integer part $u_0$ and its
first decimal (the following are zeros) is the maximum value of $x_1$ when $x$
runs through the set $E_0$ of $x \in E$ such that $x_0 = u_0$; the decimal
expansion of $u_2$ starts like that of $u_1$, but has one more decimal, namely the
maximum value of $x_2$ when $x$ runs through the set
$E_1 \subset E_0 \subset E$ of $x \in E$ such that $x_0.x_1 = u_1$, and so on. In
other words, one considers the $x \in E$ whose integer part is a maximum, then,
among them, those whose first decimal is a maximum, then, among these, those
whose second decimal is a maximum, etc. On pursuing this construction
indefinitely we find the successive decimals of the least upper bound of $E$
which we seek.

This construction allows one to prove the theorem “without knowing anything”
subject to accepting that any nonterminating decimal expansion corresponds to a
real number; but this comes back to accepting either axiom (IV) of n° 1, or
Theorem 2 which, as we have seen, is equivalent to it. More ingenious arguments
will never let you escape this. On the contrary, it is axiom (IV) which justifies
the decimal representation of the real numbers.

The concept of least upper bound also applies to the case of a family $(u_i)$,
$i \in I$, of real numbers - in other words, up to notation, of a map of the set
$I$ into $\mathbb{R}$; the least upper bound of the set $E$ of $u_i$
($x \in E \iff$ there exists an $i$ such that $x = u_i$) is denoted by

$$\sup_{i \in I} u_i$$

or simply $\sup(u_i)$ if there is no fear of ambiguity. This is the least number
which majorises all the $u_i$ or again, among the numbers which majorise it,




